data science
Massive open online courses (MOOCs) have generated high expectations about whether this new educational paradigm might change the character of higher education. MOOCs offer a vision of education beyond the boundaries of single universities and organizations, with the potential for free access by large groups of learners from any geographical area and without the requirement to meet formal entry conditions. The popularity of this mode of learning and teaching led the New York Times to name 2012 the year of the MOOC.
MOOCs tend to be featured in the popular media both as evidence that universities need to transform and as a way of compelling a fresh look at online pedagogy and learning methods. The processing of unprecedented amounts of data about students in MOOCs is of significant interest to researchers working in the field of data science (DS). The application of DS in the educational sector is termed educational data science (EDS), which uses data collected from learning environments and settings to address educational issues.
EDS builds on earlier subdisciplines of education. Certain studies [6] describe the following emerging core areas for big data and DS in education: formative and situated (classroom) assessment, technology-mediated psychometrics, self-regulated learning and metacognition, and the analysis of complex performance and comprehensive disciplinary practice.
Connectivism borrows from the earlier learning theories of behaviorism, cognitivism, and constructivism, but it argues that those theories are all about the process of learning within the individual. In a technology-networked world, it holds, we also have to consider learning that occurs outside the individual, for example in machine learning and in knowledge accumulated in databases, as well as the value of the information being learned.
The history of the MOOC, for its part, is largely traced back to David Wiley’s 2007 course, Introduction to Open Education, and then through a string of courses following an open, networked pattern. The developers of these MOOCs were shaped by earlier work in online education, distance learning, and/or learning theory, e.g., cognitive science, a field committed to understanding how the brain processes information through thought.
In 2011, another MOOC titled Introduction to Artificial Intelligence gained the attention of scholars with over 160,000 enrolled students, and for the first time, an open online course was deemed massive. This prompted Sebastian Thrun and Peter Norvig to establish a new business model for online knowledge, the start-up Udacity. Within 1 year, two more American MOOC start-ups emerged: Coursera and edX. In 2013, the Open University created its own MOOC platform, and other projects have since been developed for implementing MOOCs in other parts of the globe, such as MiríadaX in Spain, iversity in Germany, FutureLearn in the United Kingdom, Open2Study in Australia, and FUN in France.
MOOC CHALLENGES AND EDS TECHNIQUES
MOOCs have seen a spectacular increase in popularity over the past few years. Some celebrate them as a disruption of current pedagogy and practice in the education industry, while others are much more pessimistic about their influence. Indeed, despite the overall enthusiasm, there is much skepticism because of issues raised by several research studies, as we explain below.
Siemens identified four issues with MOOCs in their first years. First, MOOCs suffer from low completion rates compared to standard university courses. Second, MOOCs currently lack a profitable revenue model. Third, although MOOCs tend to be non-credit, cheating and plagiarism are becoming issues for university providers. Finally, there is a threat of deskilling the professoriate, because super professors at leading universities can offer recorded lectures to other universities.
Later work identified the following challenges and trends of MOOCs through blog mining. The lack of interaction between the MOOC instructor and learners damages course quality. MOOCs have substantially higher dropout rates than those seen in traditional education. Few colleges or universities offer full course credit to students who complete a MOOC. Conducting effective assessment in a MOOC has so far been a major challenge. Copyright in a MOOC is complex, and it is not apparent who holds the copyright for a MOOC.
Analysis of Students’ Interactions
Analysis of students’ interaction data in MOOCs can offer valuable information to instructors, material creators, and organization members attempting to enhance their MOOCs, because it reveals critical issues in the course. A typical MOOC has multiple sources of data on student interactions, including usage and engagement, video lectures, and social networking communication via forums. Each click, each page or slide read or viewed, each submission, each video player action, each test or question answered, and each social interaction within a forum can leave a digital footprint. However, it is difficult to manually analyze the huge amount of student interaction data gathered across various MOOCs. It is, therefore, necessary to use EDS techniques.
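As a minimal sketch of what such a digital footprint might look like once collected, the snippet below counts events per student and per event type. The record layout, the event names, and the `summarize_events` helper are assumptions made for illustration; real MOOC platforms export their own log formats.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream records: (student_id, event_type).
events = [
    ("s1", "video_play"),
    ("s1", "video_pause"),
    ("s1", "forum_post"),
    ("s2", "video_play"),
    ("s2", "quiz_submit"),
    ("s1", "quiz_submit"),
    ("s3", "page_view"),
]


def summarize_events(records):
    """Count how many events of each type every student generated."""
    summary = defaultdict(Counter)
    for student_id, event_type in records:
        summary[student_id][event_type] += 1
    return summary


footprints = summarize_events(events)
for student_id in sorted(footprints):
    print(student_id, dict(footprints[student_id]))
```

Per-student counts of this kind are only a starting point; they are the sort of table that the visual and predictive analyses discussed in the following sections would take as input.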
Analyzing Usage and Engagement
One of the earliest and most prevalent uses of EDS in MOOCs is the analysis of usage and engagement to generate summary information about each student’s interaction with the platform. One of the quickest ways to help instructors and other stakeholders understand MOOC data is to represent the collected data on student use and engagement graphically, using plots, bar charts, iterative analysis, and state transition diagrams.
This allows instructors to conveniently identify the weaknesses of their courses. Various forms of engagement in MOOCs, namely active participants, lurkers, and passive participants, have been identified, as well as determinants that influence engagement, for example, confidence, experience, and motivation. Designers of future MOOCs can apply these findings to customize the learning experience for the different types of learners who might want to learn in this way. Learner engagement has also been classified as active, passive, or disengaged using a data-driven approach in which Probabilistic Soft Logic represents the latent engagement types. The authors of that work argue that understanding student engagement as a course progresses is necessary to reduce the dropout rate.
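The state transition diagrams mentioned above can be derived from ordered activity logs. The following sketch, under the assumption that each student's activity is available as an ordered list of activity labels, estimates the probability of moving from one activity to the next. The `sessions` data and the `transition_probabilities` helper are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical ordered activity sequences, one per student.
sessions = {
    "s1": ["video", "quiz", "forum", "video"],
    "s2": ["video", "quiz"],
    "s3": ["video", "video", "quiz"],
}


def transition_probabilities(sequences):
    """Estimate P(next activity | current activity) from observed sequences."""
    counts = defaultdict(Counter)
    for sequence in sequences.values():
        # Pair each activity with the one that immediately follows it.
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    probabilities = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probabilities[current] = {
            following: n / total for following, n in followers.items()
        }
    return probabilities


transitions = transition_probabilities(sessions)
for current in sorted(transitions):
    print(current, transitions[current])
```

Each entry of `transitions` corresponds to one labeled arrow in a state transition diagram; a plotting library could then draw the diagram from this dictionary.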
Predicting Students at Risk of Dropping Out
MOOCs have low completion rates compared to conventional university courses, and this high rate of learner dropout has been cited as a reason to question their future. The funnel of participation metaphor illustrates the sharp drop-off in activity and the pattern of sharply unequal participation that seem to be typical of MOOCs. There are several potential explanations for these low MOOC retention rates, including limited time, lack of learner motivation, isolation, insufficient interactivity, inadequate background knowledge and skills, and hidden costs.
Various EDS methods, primarily correlation, regression, and classification, have been used to identify students at risk of dropping out. Pearson’s correlation between student attributes and students’ eventual performance has been used, and it has been proposed to extend the variables associated with student dropout in MOOCs to finer temporal granularities to enable more precise predictions. The findings indicate that the use of finer-grained temporal detail improves predictive power in the initial stages of a course. Logistic regression has been used to predict final performance in the course based on a blend of students’ performance on weekly assignments and their social engagement in the MOOC.
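To make the two techniques named above concrete, here is a small self-contained sketch of Pearson's correlation and of a logistic regression fitted by batch gradient descent. It is not the method of any cited study: the feature choice (mean weekly assignment score and forum posts per week), the toy data, the learning rate, and the helper names are all assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    weights = [0.0] * (len(features[0]) + 1)
    n = len(features)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, label in zip(features, labels):
            x = [1.0] + list(row)  # prepend the constant bias input
            error = sigmoid(sum(w * v for w, v in zip(weights, x))) - label
            for j, v in enumerate(x):
                grads[j] += error * v
        weights = [w - lr * g / n for w, g in zip(weights, grads)]
    return weights


def predict_dropout(weights, row):
    """Return the modeled probability that a student drops out."""
    x = [1.0] + list(row)
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


# Hypothetical features: (mean weekly assignment score in [0, 1], forum posts per week).
features = [(0.9, 3.0), (0.8, 2.0), (0.7, 2.5), (0.3, 0.0), (0.2, 0.5), (0.1, 0.0)]
dropped_out = [0, 0, 0, 1, 1, 1]  # 1 = the student dropped out

r = pearson([f[0] for f in features], dropped_out)
weights = fit_logistic(features, dropped_out)
print("correlation between score and dropout:", round(r, 3))
print("risk for an engaged student:", predict_dropout(weights, (0.85, 2.5)))
print("risk for a disengaged student:", predict_dropout(weights, (0.15, 0.0)))
```

In practice one would typically use an established statistics or machine learning library and far more data; the hand-written version is shown only so that every step of the computation is visible.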
Heterogeneity among MOOC students is high, because an immense number of participants with varied educational and cultural backgrounds, and with differing levels of experience and maturity, take part at the same time. Participants can become frustrated by a lack of adaptation to their individual needs and learning styles. One solution to this issue is to employ adaptive learning systems, which give learners a feeling of personalization by using a learner model, and recommender systems, which make recommendations to an individual student or a group of students.
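One simple way a recommender system of this kind could work is neighborhood-based: suggest resources that similar learners have used. The sketch below assumes a hypothetical record of completed resources per learner and uses Jaccard similarity between those sets; it illustrates the general idea rather than any specific system from the literature.

```python
# Hypothetical record of which course resources each learner has completed.
completed = {
    "ana": {"intro_video", "quiz_1", "reading_a"},
    "ben": {"intro_video", "quiz_1", "reading_b", "lab_1"},
    "cho": {"reading_c"},
}


def jaccard(a, b):
    """Overlap between two sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(learner, history, top_n=2):
    """Rank unseen resources by the similarity of the learners who used them."""
    seen = history[learner]
    scores = {}
    for other, resources in history.items():
        if other == learner:
            continue
        weight = jaccard(seen, resources)
        for resource in resources - seen:
            scores[resource] = scores.get(resource, 0.0) + weight
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [resource for resource, score in ranked[:top_n] if score > 0]


suggestions = recommend("ana", completed)
print(suggestions)
```

Resources used only by learners with no overlap receive a score of zero and are filtered out, so a learner with an unusual history may get no suggestions; a production system would need a fallback for that case.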
CONCLUSIONS
MOOCs were hailed as the perfect research vehicle for education: their huge sample sizes and the possibility of monitoring students’ interactions in detail throughout a course present unmatched opportunities for conducting learning experiments. This treasure trove has led to a considerable increase in research studies and in the use of EDS techniques on the data collected on such platforms.
In this paper, we have listed and described four of the existing issues or problems in MOOCs that have been addressed through EDS techniques. We have chosen the key works for each problem up to 2015. Yet, we consider that other critical issues will be of great interest in the future use of EDS in MOOCs. The volume of data produced by truly massive MOOCs can be so large that standard data mining (DM) methods are not appropriate, and big data (BD) methods need to be employed. The term BD is frequently used to refer to data sets that have grown by many orders of magnitude, into the range of exabytes and zettabytes.
Data sets at this scale cannot be captured, handled, and processed within a practical time using generally available software tools. In addition, they can be too large to be stored in main memory, and the performance of certain DM algorithms may be compromised when such large data sets are involved. All of this has motivated and driven BD research, whose established frameworks are today being used successfully in other applications.