Abstract: Group tendency is a research branch of computer-assisted learning. Building good learning behavior matters greatly for learners' learning processes and outcomes, and it is a key basis for data-driven educational decision-making. Cluster analysis is an effective method for studying group tendency. This study therefore obtains a big data set of online learning behavior spanning multiple periods and multiple courses, and describes learning behavior as multi-dimensional learning interaction activities. First, on the basis of data initialization and standardization, we locate the classification conditions of the data, differentiate and integrate learning behaviors, and form multiple subsets of data to be clustered. Second, according to the topological relevance and dependence between learning interaction activities, we design an improved BIRCH clustering algorithm based on a random walk strategy, which supports the retrieval and evaluation of key learning interaction activities and their data. Third, calculating and comparing several performance indexes shows that the improved algorithm has clear advantages in clustering learning interaction activities, and that the clustering process and results are feasible and reliable. The conclusions can serve as a reference and can be generalized. They have practical significance for research on educational big data and for the practical application of learning analytics.
### What problems does this paper attempt to solve?
This paper addresses cluster analysis of online learning interaction activities. Specifically, the authors aim to cluster large-scale, multi-course, multi-period online learning behavior data effectively by combining an improved BIRCH algorithm with a random walk strategy. The specific goals and problems of the research are as follows:
1. **Constructing good learning behaviors**:
- The paper argues that constructing good learning behaviors is crucial to learners' learning processes and outcomes, and that it is the basis for data-driven educational decision-making.
2. **Processing complex learning interaction activity data**:
- The data generated by online learning interaction activities are massive, discrete, and autonomous. Information is sparse, and the data rarely form a continuous, complete description, so useful value is hard to mine directly from the raw data. It is therefore necessary to extract useful information and derive knowledge or wisdom by mining the internal relationships among learning interaction activities.
3. **Limitations of traditional clustering algorithms**:
- Traditional clustering algorithms (such as K-means and hierarchical clustering) are limited when processing learning behavior data: they cannot fully analyze and compare the data and the relationships among them, which restricts their use in calculating and constructing learning interaction activities.
4. **Proposing an improved BIRCH algorithm**:
- To address these problems, the authors propose an improved BIRCH clustering algorithm based on a random walk strategy. The random walk locates the key paths of learning interaction activities, which narrows the data boundaries of the cluster analysis and improves the accuracy and efficiency of clustering.
5. **Verifying the effectiveness of the algorithm**:
- The paper calculates and compares multiple performance indicators to verify the advantages of the improved BIRCH algorithm in clustering learning interaction activities, and to confirm that the clustering process and results are feasible and reliable.
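
This summary does not spell out the random walk strategy in item 4. As a rough illustration only, the sketch below walks a small, hypothetical activity graph, builds the row-stochastic transition matrix of the walk, and ranks activities by how often repeated walks visit them. The graph, the function names, and the use of visit frequency as a proxy for "key" activities are all assumptions made for this example, not the paper's method.

```python
import random
from collections import Counter

# Hypothetical activity graph (an assumption for illustration): each key is a
# learning interaction activity, each value lists the activities reachable next.
ACTIVITY_GRAPH = {
    "video": ["quiz", "forum"],
    "quiz": ["video", "forum", "assignment"],
    "forum": ["video", "quiz"],
    "assignment": ["quiz"],
}

def random_walk(graph, start, steps, rng):
    """Walk up to `steps` edges from `start`, choosing each next vertex
    uniformly among the current vertex's neighbors. The choice depends only
    on the current vertex, which is the Markov property."""
    path = [start]
    for _ in range(steps):
        neighbors = graph[path[-1]]
        if not neighbors:  # dead end: stop early
            break
        path.append(rng.choice(neighbors))
    return path

def transition_matrix(graph):
    """Return P with P[u][v] = P(X_{n+1} = v | X_n = u) for a uniform walk."""
    return {
        u: {v: nbrs.count(v) / len(nbrs) for v in set(nbrs)}
        for u, nbrs in graph.items()
        if nbrs
    }

def visit_frequencies(graph, start, steps, walks, seed=0):
    """Run `walks` independent walks and return each vertex's share of visits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(walks):
        counts.update(random_walk(graph, start, steps, rng))
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

path = random_walk(ACTIVITY_GRAPH, "video", 6, random.Random(42))
P = transition_matrix(ACTIVITY_GRAPH)
freq = visit_frequencies(ACTIVITY_GRAPH, "video", steps=20, walks=200)
ranked = sorted(freq, key=freq.get, reverse=True)  # most-visited activities first
```

Under this assumption, the top entries of `ranked` could be kept as the dimensions passed to the clustering step, which is one plausible reading of "narrowing the data boundaries".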
### Formulas and methods
- **Random walk model**: Let \(G=(V, E)\) be a graph and \(v_0\in V\) the starting point. A vertex \(v_i\) is selected at random as the walking target, and then \(v_j\) is selected as the target for the next step.
- **Markov chain**:
\[
P(X_{n + 1}=j|X_n = i, X_{n-1}=x_{n-1},\dots, X_0 = x_0)=P(X_{n + 1}=j|X_n = i)
\]
- **CF-tree structure**:
\[
\text{CF node}=(N, LS, SS)
\]
where \(N\) is the number of samples, \(LS\) is the sum vector of the feature dimensions, and \(SS\) is the sum of squares of the feature dimensions.
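
To make the CF triple concrete, here is a minimal sketch of a clustering feature in Python. It shows why the triple is convenient: two CFs merge by component-wise addition, and the centroid and radius of a subcluster can be recovered from \((N, LS, SS)\) alone, without revisiting the samples. The class and method names are hypothetical, and \(SS\) is stored as a single scalar (the sum of squared norms), which is one common convention and an assumption here. The paper's improved algorithm may differ.

```python
import math
from dataclasses import dataclass

@dataclass
class CF:
    """Clustering feature (N, LS, SS) of a subcluster (illustrative sketch)."""
    n: int        # N: number of samples
    ls: list      # LS: per-dimension linear sum of the samples
    ss: float     # SS: sum of squared norms of the samples (scalar convention)

    @classmethod
    def from_point(cls, x):
        """Build the CF of a single sample."""
        return cls(1, list(x), sum(v * v for v in x))

    def merge(self, other):
        """CF additivity: the CF of a union is the component-wise sum."""
        return CF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )

    def centroid(self):
        """Centroid = LS / N."""
        return [v / self.n for v in self.ls]

    def radius(self):
        """Root-mean-square distance to the centroid:
        sqrt(SS / N - ||LS / N||^2), clamped at 0 against rounding error."""
        c = self.centroid()
        return math.sqrt(max(self.ss / self.n - sum(v * v for v in c), 0.0))

a = CF.from_point([0.0, 0.0])
b = CF.from_point([2.0, 0.0])
merged = a.merge(b)  # CF of the two-sample subcluster
```

In standard BIRCH, a new sample is absorbed into the closest leaf entry only if the merged radius (or diameter) stays under a threshold. With the sketch above, that check would be `entry.merge(CF.from_point(x)).radius() <= threshold`.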
### Summary
The main contribution of this paper is to propose an improved BIRCH clustering algorithm based on a random walk strategy for effective cluster analysis of online learning interaction activities. This not only helps to understand the interaction behavior patterns of learners but also provides support for personalized learning and teaching practices.