Factor Structure and Dimensionality of an Instrument Designed to Measure the Metacognitive Orientation of Thai Science Classroom Learning Environments

The purpose of this study was to establish the factor structure and dimensionality of the Metacognitive Orientation Learning Environment Scale – Science (MOLES-S) in the Thai context. The metacognitive orientation of a science classroom learning environment is defined as the extent to which psychosocial conditions that are known to enhance students’ metacognition are evident in a specific science classroom. This study builds on earlier work in the research areas of science education, metacognition, and learning environments. A sample of 5418 Thai science students in grades 10 to 12, from 40 schools across Thailand, completed the MOLES-S that had been translated into Thai. Exploratory factor analysis was undertaken and Rasch analysis was used to calibrate the scale and explore its dimensionality. The results suggest that the MOLES-S(T), where (T) represents Thailand, has the same factor structure as the original MOLES-S, is reliable, and can be used with confidence in research into metacognition in Thai high school science classrooms.


Introduction
The development and enhancement of students' metacognition is well established as an educational goal (e.g., Donovan & Bransford, 2005; Sternberg, 1998; Thomas, 2012). While an exact definition of metacognition continues to be somewhat problematic for scholars (e.g., Hsu et al., 2016; Scott & Levy, 2013), the position taken in this paper is well established in the literature: metacognition refers to an individual's knowledge, control and awareness of their thinking and learning processes (Flavell, 1979; Garner & Alexander, 1989; Thomas, 2012). Developing and enhancing students' metacognition can help improve their learning of science (e.g., Georghiades, 2006; Zhao et al., 2014) and other subjects (Hattie, 2014; Ohtani & Hisasaka, 2018). Therefore, developing interventions that attend to its development in everyday science classrooms is important.
The Metacognitive Orientation Learning Environment Scale – Science (MOLES-S) was developed by Thomas (2003, 2004). Its purpose is to enable researchers and teachers to establish quantitatively the extent to which specific psychosocial factors, known to be important for establishing a classroom environment that is conducive to the development and enhancement of students' metacognition (Table 1), are evident or otherwise in any science classroom. Its development was informed by the processes used to develop other classroom learning environment instruments such as the Constructivist Learning Environment Survey (CLES) (Taylor et al., 1994) and the Individualised Classroom Environment Questionnaire (ICEQ) (Fraser, 1990). The origins of the field of classroom learning environments can be traced to the writings of Kurt Lewin (1936), who proposed that an individual's behaviour is determined by that individual's environment and its interactions with the individual's personal characteristics. Lewin's work was followed by that of Murray (1938), who offered a Needs-Press Model. This model suggested that a degree of variation in an individual's behaviour could be accounted for by situational variables in the individual's environment. Murray also coined the terms 'beta press' and 'alpha press.' Beta press refers to the description of an environment as perceived by those within that environment, e.g., students and teachers. Alpha press refers to the description of an environment as perceived by a detached observer, e.g., a researcher conducting classroom observations. The MOLES-S seeks beta press reports from students in relation to the extent to which the factors summarized in Table 1 are evident to them in their science classrooms. These student perceptions are important for establishing the ecological validity of classroom studies, especially those involving pedagogical interventions.
For research to be ecologically valid, the research participants, e.g., students, should perceive their learning environment, happenings within their environment, and their behaviours and thinking to be consistent with those of the researcher (Kihlstrom, 2021). When there are discrepancies between the beta press and alpha press reports, the ecological validity of the research could be called into question.
The MOLES-S was developed originally in the English and Chinese languages. It uses a Likert scale from 1 to 5 for students to report their perceptions of their classroom environment in relation to the scales in Table 1. These scales represent dimensions of metacognitively oriented science classroom learning environments. Since its development, it has been used in studies in, for example, the Philippines (Sagun & Prudente, 2021), Canada (Thomas, 2013, 2017), Turkey (Şahin, 2015), and the United States of America (Peters & Kitsantas, 2010). It has also been used in limited research in Thailand (Chantharanuwong et al., 2012), although in the Thai study, like those aforementioned, statistical analyses of its factor structure and dimensionality were not reported. This study explores the potential for the future use of the MOLES-S in research in Thai science classrooms. Research into metacognition in Thai science education is developing (e.g., Chantharanuwong et al., 2011, 2014, 2016; Pimvichai et al., 2019). Designing pedagogical interventions to develop and enhance students' metacognition that reflect the dimensions of the MOLES-S might lead to changes in classroom practices that result in improvements in science learning and that improve Thailand's educational performance as reported in international studies such as PISA (Organisation for Economic Co-operation and Development, 2019a, 2019b). This paper reports on the extent to which the 7 factors of the MOLES-S are endorsed by students in Thai science classrooms, and whether the sub-scales of the MOLES-S can be calibrated as a unidimensional scale for research in Thailand. Our ultimate aim was to validate a Thai version of the MOLES-S, which we will refer to from this point as the MOLES-S(T), where (T) stands for Thailand.

Instrument Design and Field Testing
The MOLES-S was developed by Thomas (2003) originally in English and was translated into Chinese for its initial validation and statistical analysis. In this study, conducted in Thailand, it was necessary to develop a version in the Thai language. Consequently, translations and back translations, as described by Brislin (1980) and Behr and Shishido (2016), took place. These activities involved the authors and bilingual university academics with expertise in psychology, scale construction, and metacognition. Feedback on the face validity and conceptual congruence of the MOLES-S(T) with the original MOLES-S assisted with finalizing the version of the MOLES-S(T), as shown in Appendix A, that was used in this study.
The 5418 student participants were drawn from 40 schools across four regions of Thailand. This sample is more than sufficient for field testing a learning environments instrument. The students came from 162 classes within those schools. Table 2 provides details of the locations of the schools, the number of classes of each grade sampled from each region, and the number of student participants from each region. The data were collected as hard copies by the second author with assistance from teachers in the 40 schools. The raw data were entered into SPSS Version 28.0.0.0. Only those questionnaires that were completed in full by students (all items answered) were entered into the database.

Analysis of Data
The analytic procedures employed and reported in this paper are mainstream and accepted in developing survey instruments in the field of learning environments (see, for example, Fraser et al., 1995; Schultz-Jones & Ledbetter, 2013; Thomas, 2003, 2004; Thomas et al., 2013; Ward & Fisher, 2013). The data were analyzed using principal components factor analysis followed by varimax rotation (see, for example, Jolliffe, 2002; Paz, 2008; Reise et al., 2000) using SPSS Version 28.0.0.0. Only items with factor loadings greater than .40 were retained for each factor as this cutoff "seems to be preferred by many researchers" (Salkind, 2010, p. 482), including those working in the field of learning environments. Internal consistency was estimated by calculating Cronbach alpha coefficients (see, for example, Santos, 1999; Tavakol & Dennick, 2011). The discriminant validity, representing the extent to which the dimensions represented by the sub-scales overlap, was computed using the mean correlation of a sub-scale with the other 6 sub-scales as a convenient index.
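To make the two classical indices named above concrete, the following is a minimal sketch, not the SPSS procedure used in the study. It computes Cronbach's alpha for one sub-scale and the mean correlation of one sub-scale with the others. The function names and all data values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one sub-scale.

    rows: one list of item responses per student (all items answered).
    """
    k = len(rows[0])  # number of items in the sub-scale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


def mean_correlation_with_others(scale_scores, target):
    """Mean correlation of one sub-scale with every other sub-scale."""
    others = [name for name in scale_scores if name != target]
    return mean(pearson(scale_scores[target], scale_scores[o]) for o in others)


# Hypothetical responses of five students to a three-item sub-scale (1-5 Likert).
toy_rows = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(toy_rows)

# Hypothetical sub-scale totals for the same five students.
toy_scales = {
    "Scale A": [13, 7, 14, 9, 5],
    "Scale B": [12, 9, 11, 10, 8],
    "Scale C": [9, 10, 8, 11, 9],
}
index_a = mean_correlation_with_others(toy_scales, "Scale A")
```

A lower mean correlation with the other sub-scales indicates less overlap, which is how the index is read as evidence of discriminant validity.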
Following the exploratory factor analysis, the MOLES-S(T) was calibrated using Rasch scale modeling (Wright & Masters, 1982). The responses for all 35 items from the MOLES-S(T) were calibrated together on a common scale using WINSTEPS (Linacre, 2021; Linacre & Wright, 1999). The Rasch model specifies the form of the relations between persons and items on a questionnaire such as the MOLES-S(T) to operationalize the metacognitive orientation of Thai science classrooms. The likelihood of higher composite scores on the MOLES-S(T) would increase as students perceive the classroom environment to be more metacognitively oriented. Conversely, scores on the MOLES-S(T) would decrease if students do not perceive their classrooms to be metacognitively oriented. In the latter instance the items would be harder for students to endorse.
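The relation between person measures, item difficulties, and the likelihood of endorsement can be illustrated with a short sketch. It assumes the Andrich rating scale form of the Rasch model, which the paper does not spell out, and it is not the WINSTEPS estimation routine. The threshold values are hypothetical. The person and item values in the usage note are taken from the results reported later in this paper.

```python
import math


def rsm_category_probabilities(theta, delta, thresholds):
    """Category probabilities under an assumed rating scale model.

    theta: person measure in logits
    delta: item difficulty in logits
    thresholds: step thresholds between adjacent categories (hypothetical)
    """
    # Cumulative sums of (theta - delta - tau); the lowest category has an empty sum of 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    denominator = sum(math.exp(v) for v in exponents)
    return [math.exp(v) / denominator for v in exponents]


def expected_rating(theta, delta, thresholds, lowest=1):
    """Model-expected response on a scale that starts at `lowest`."""
    probabilities = rsm_category_probabilities(theta, delta, thresholds)
    return sum((lowest + k) * p for k, p in enumerate(probabilities))


# Four hypothetical thresholds separate the five response categories (1 to 5).
TOY_THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]

# Same person measure, two items of different difficulty.
harder = expected_rating(0.34, 0.91, TOY_THRESHOLDS)
easier = expected_rating(0.34, -0.82, TOY_THRESHOLDS)
```

Here 0.34 logits is the reported person mean, and 0.91 and -0.82 logits are the reported difficulties of the hardest and easiest items. Under these assumptions the expected rating for the harder item is lower than for the easier one, which is what "harder to endorse" means in this context.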
Rasch modeling provides several metrics that are useful for scale construction and development. Person and Item separation and reliability of separation are measures of the MOLES-S(T)'s spread across what can be termed the metacognitive orientation continuum (Thomas, 2004). In Rasch analysis, the Person Separation indices are analogous to the reliability analyses of traditional test theory. Reliability is considered "a property of the sample being measured by the scale, as well as a measure of the scale being gauged by the sample" (Mok & Flynn, 2002, p. 23). Item Separation indices indicate whether a scale's items are able to "define a line of increasing intensity in relation to the degree to which they are separated along that line" (Thomas, 2004, p. 375). Item and Person separation indices are reported in this paper, and a value of .70 is taken as acceptable for both. Values higher than .70 suggest higher Rasch reliability. Also, we compute H1, to indicate the number of item strata defined by the scale, as suggested by Wright and Masters (1982). The calculation for H1 is H1 = (4G1 + 1)/3, where G1 is the Item Separation Index. The item difficulty reveals whether the item was difficult or easy for the students to endorse. Higher item difficulties (logits of greater value) suggest that students would find these items harder to endorse, i.e., they score these items lower.
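The strata formula can be checked with a few lines of code. The sketch below implements H1 = (4G1 + 1)/3 as given above. The second function uses the standard Rasch relation between a separation index G and its reliability, R = G^2 / (1 + G^2), which is an assumption added here for illustration and is not stated in this paper. The function names are hypothetical.

```python
def item_strata(separation):
    """Number of strata, H = (4G + 1) / 3, for a separation index G."""
    return (4 * separation + 1) / 3


def separation_reliability(separation):
    """Reliability implied by a separation index (assumed relation R = G^2 / (1 + G^2))."""
    return separation ** 2 / (1 + separation ** 2)


# Separation indices reported in the results of this study.
ITEM_SEPARATION = 29.48
PERSON_SEPARATION = 3.51

strata = item_strata(ITEM_SEPARATION)
person_reliability = separation_reliability(PERSON_SEPARATION)
item_reliability = separation_reliability(ITEM_SEPARATION)
```

With the reported Item Separation Index of 29.48, the formula gives (4 × 29.48 + 1)/3 = 39.64 strata. Under the assumed relation, the reported Person Separation Index of 3.51 corresponds to a reliability of about 0.92, and the item separation to a reliability that rounds to 1.00, which agrees with the reliabilities reported in the results.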
Outfit MNSQs are indicators of the extent to which each item fits a Rasch rating scale (Smith, 1999). Outfit MNSQ values between 0.5 and 1.5 suggest that an item reasonably fits a Rasch rating scale, and values outside this range suggest the item might not fit the Rasch model. High point biserial correlations for a set of items, those above .20 as a general rule, suggest that the items are good indicators of a unified construct.
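The two screening rules described above can be expressed as a simple check. This is a minimal sketch that applies the 0.5 to 1.5 Outfit MNSQ range and the .20 point biserial guideline to values that have already been estimated elsewhere. The function name, item labels, and statistics are hypothetical.

```python
def screen_items(outfit_mnsq, point_biserial, mnsq_range=(0.5, 1.5), min_corr=0.20):
    """Return items that miss either criterion, with the reasons.

    outfit_mnsq: dict mapping item id to Outfit MNSQ
    point_biserial: dict mapping item id to point biserial correlation
    """
    low, high = mnsq_range
    flagged = {}
    for item, mnsq in outfit_mnsq.items():
        reasons = []
        if not (low <= mnsq <= high):
            reasons.append("outfit")
        if point_biserial[item] <= min_corr:
            reasons.append("point-biserial")
        if reasons:
            flagged[item] = reasons
    return flagged


# Hypothetical statistics for three items; only items 2 and 3 miss a criterion.
toy_outfit = {1: 0.73, 2: 1.62, 3: 1.10}
toy_corr = {1: 0.45, 2: 0.38, 3: 0.15}
flagged_items = screen_items(toy_outfit, toy_corr)
```

An empty result would mean that every item falls within the accepted Outfit MNSQ range and exceeds the point biserial guideline.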

Findings / Results
Statistical information in the form of internal consistency, each sub-scale's item mean, standard deviation, and discriminant validity is provided in Table 3. The Cronbach alphas range from .75 to .89. These figures suggest that there is an acceptable level of internal consistency among the items for each of the scales. However, as will be discussed later in relation to the Rasch analysis, the high Cronbach alpha figures for some scales might imply the redundancy of some items in these scales (Taber, 2018). The discriminant validity values for each scale (using the mean correlation of a scale with the other six scales as a convenient index) suggest that, while there is some overlap between the dimensions, the scales measure distinct aspects of the Thai science classrooms' psychosocial environments.
Table 5 shows the factor loadings of the items in the MOLES-S(T). Each of the 35 items had a factor loading of greater than .40 with its own scale and less than .40 with other scales, therefore providing support for the factorial validity of the MOLES-S(T). These distinct factors and the items they consist of mirror those reported by Thomas (2003), who also used .40 as the cut-off figure for determining whether an item loaded onto one factor or another, and to what extent.
To report the findings of the Rasch analysis, several values are used. Table 6 reports the item difficulty, the outfit mean squares (Outfit MNSQ), the infit mean squares (Infit MNSQ), and the point biserial correlations for the MOLES-S(T) items. Mean infit and outfit should be as close as possible to 1. For the Person mean squares the outfit and infit are both 1.00. For the Item mean squares the outfit and infit are also both 1.00. Ideally the mean standardized infit and outfit should be 0.00. In this case they are -.03 and -.03 for Persons and -0.8 and -0.7 for Items. Therefore, on average, the items overfit, suggesting that the data fit the Rasch model better than expected.
Such overfit can suggest some redundancy and the potential to further reduce the number of items in the instrument while not necessarily reducing its diagnostic value. This possibility is discussed below in relation to relevant items.
An indication of the extent of the overall misfit is the standard deviation of the standardized infit (Bode & Wright, 1999). Against a cut-off criterion of 2.00, the standardized infit SD is 0.54 for persons and 0.18 for items. Therefore, the data suggest acceptable overall fit.
Table 7 shows the spread of the scale items over the expectation continuum. The items are located according to the scale they are members of. The items spread out from -0.82 (Item 35) to 0.91 (Item 22) logits over the expectation continuum. The item separation index is 29.48 and there are approximately 39.64 item strata. Therefore, it is proposed that there is reasonable item separation on the metacognitive orientation of Thai science classrooms continuum. Closer inspection of Figure 1, however, reveals that the items for some scales spread along the continuum more than items for other scales. This is evident for the Student Voice and Metacognitive Demands scales. Also, the Distributed Control and Emotional Support scales do not overlap with any of the other scales, suggesting they could be very distinct scales within the MOLES-S(T). It is also worth reporting that there is substantial clumping of the items in some dimensions, particularly Student-Teacher Discourse and Emotional Support. This was also reported by Thomas (2003). Therefore, while there is a reasonable item separation and spread of items across the continuum, it would be preferable to have less clustering for the items for some dimensions, and more overlap between the items of some dimensions. The person mean measure (Table 6) is 0.34 logits which suggests that, if anything, the MOLES-S(T) items were difficult for the students to endorse, but that they were still fairly well matched to the sample of the 5418 students. Finally, it is noteworthy that the difference between the item and student means is not greater than positive one or less than negative one.
Had this been so, it would have meant that some items were potentially too easy or too difficult to endorse. The implication of such a finding would be that there would be a need to undertake a major revision of the items.
Table 8 presents further results of the Rasch analysis, including item difficulties and point biserial correlations for the MOLES-S(T) items. The items are shown in their expectation dimensions and ordered with respect to descending order of difficulty to assist reader interpretation. The results suggest that, overall, the MOLES-S(T) is quite reliable in Rasch terms. The Real Item Reliability is 1.00, suggesting high internal consistency, and the Real Student reliability is 0.92. The Person Separation Index is 3.51, which is above the 0.7 criterion threshold. The Outfit MNSQ values are between 0.73 (Item 12) and 1.33 (Item 19). No item has an Outfit MNSQ outside the accepted range of 0.5 to 1.5. Therefore, the items fit the Rasch scale model well overall. Finally, the generally high point biserial correlations suggest that all of the MOLES-S(T) items are good indicators of a unified construct.

Discussion
As noted above, the development and enhancement of science students' metacognition is an educational priority worldwide. In Thailand, increased interest in metacognition has been evident in the last decade, and this interest is likely to continue. A general interpretation of the findings from both the factor and Rasch analyses is that the Thai students found the Emotional Support and Encouragement and Support items easy to endorse, and the Distributed Control items difficult to endorse. This finding is consistent with that of Thomas (2003). Thomas proposed that the notion of distributed control is a "novel concept for both students and teachers" based on a shared understanding and expectation "of a traditional science classroom that is dominated by teacher talk and passive student compliance" (p. 380). In relation to Emotional Support, the 1026 students from Hong Kong in Thomas's (2003) study and the 5418 Thai students in this study both found the items in that scale easiest to endorse. This may be because the matters these items attend to are commonly reflected upon by students in both contexts. Further research on this matter would be necessary to support such a notion.
It is worth noting that the item-factor relations of this instrument in Thai strongly replicate the findings of the initial development of the MOLES-S by Thomas (2003). All 35 items of the original MOLES-S loaded onto the same factors, with loadings greater than .40, using the data collected from the Thai students. In other words, the same factor structure is evident. However, this replication of items, factors, and loadings is not found across other adaptations of the MOLES-S. For example, in the aforementioned study by Şahin (2015) the researcher used a version of the MOLES-S validated for use in Turkish classrooms by Yildiz and Ergin (2007). That version of the MOLES-S contains 21 items and five subscale dimensions: Emotional Support, Distributed Control, Student-Student Discourse, Student Voice, and Metacognitive Demands. The sub-scales for Encouragement and Support and Student-Teacher Discourse are omitted from the Turkish version. Further, only items 1 and 4 of the original MOLES-S (Thomas, 2003) constitute the Metacognitive Demands sub-scale. Şahin used confirmatory factor analysis to confirm the factor structure of the 21-item instrument for her study. This use of a modified version of the MOLES-S in Turkey, along with other aforementioned applications in Hong Kong, Canada, the USA, the Philippines, and Thailand, suggests that at least five of the dimensions of a metacognitively oriented science classroom, as proposed by Thomas (2003), are viable and valid for use across numerous linguistic and cultural settings. The MOLES-S(T) can be used to explore specific dimensions of the metacognitive orientation of an individual science classroom's or science classrooms' learning environment/s in Thailand. In such use, individual scales could be focused on for research and pedagogical intervention. The MOLES-S(T) can also be used to ascertain measures of the overall metacognitive orientation of science classroom learning environments.
Future research possibilities still exist in relation to the MOLES-S. As with other learning environment instruments, each factor provides an overview of a particular dimension, supported by literature, that can be used to frame and report on life in classrooms and the psychosocial factors within them. However, the dimensions of the MOLES-S by themselves are quite broad and do not provide substantial detail regarding the specifics of classroom life. For example, in the Metacognitive Demands dimension there is the item, "[In this science classroom] Students are asked by the teacher to try new ways of learning science" (Thomas, 2003, p. 191). This item is general in its orientation and does not ask anything specifically about what those new ways might be or are. It would be possible to take each of the dimensions (sub-scales) of the MOLES-S and expand the number of items in them to seek more specific information to inform researchers about classroom life and specific pedagogies in more detail. Future research could explore this possibility. The alternative is to use time-consuming mixed-method approaches, such as that employed by Thomas (2013, 2017), to seek to clarify what 'new ways of learning science' students are referring to when they respond to that item in relation to their teacher. Further, Thomas (2003) reported that his proposed sub-scale 'Teacher modeling and explanation' did not survive factor analysis in the initial development of the MOLES-S. However, the findings from Thomas and Anderson (2014) and Thomas (2013, 2017) clearly suggest that when teachers model and explain cognitive and learning processes and strategies to high school science students, (a) the metacognitive orientation of the classroom improves, and (b) students begin to think and learn differently and in a more conscious manner than they did prior to such teacher pedagogy.
It would be informative for future research to revisit the 'Teacher modeling and explanation' construct and consider how this construct could be operationalized in the MOLES-S.

Conclusion
The findings from the Rasch analysis suggest the MOLES-S(T) can be considered a unidimensional scale that gives a raw score out of 175 (seven scales, each with a maximum possible score of 25). Therefore, data from using the MOLES-S(T) can be used to provide both (a) information that focuses on separate elements of the metacognitive orientation of a science classroom learning environment, and (b) a score that provides an overall summative measure of a science classroom's metacognitive orientation.
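The two uses described above, sub-scale scores and an overall raw score, can be sketched as follows. The sketch assumes that each of the seven scales has five items answered on the 1 to 5 scale and that items are ordered in consecutive blocks of five per scale. The scale order and the item-to-scale assignment are assumptions made for illustration only; the actual assignment is given in the instrument itself (Appendix A).

```python
# Scale names as used in this paper; their order here is an assumption.
SCALES = [
    "Metacognitive Demands",
    "Student-Student Discourse",
    "Student-Teacher Discourse",
    "Student Voice",
    "Distributed Control",
    "Encouragement and Support",
    "Emotional Support",
]
ITEMS_PER_SCALE = 5  # 35 items across 7 scales


def score_moles(responses):
    """Return (sub-scale scores, total raw score) for one student's 35 responses."""
    if len(responses) != len(SCALES) * ITEMS_PER_SCALE:
        raise ValueError("expected 35 responses")
    if any(r not in range(1, 6) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    subscales = {}
    for index, name in enumerate(SCALES):
        start = index * ITEMS_PER_SCALE
        # Assumed layout: items for each scale sit in one consecutive block.
        subscales[name] = sum(responses[start:start + ITEMS_PER_SCALE])
    return subscales, sum(subscales.values())


# A hypothetical student who answers 5 to every item reaches the maximum.
max_subscales, max_total = score_moles([5] * 35)
```

Each sub-scale score ranges from 5 to 25 and the total from 35 to 175. Dividing a sub-scale score by five returns it to the 1 to 5 response metric, which can make sub-scale means easier to interpret.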

Recommendations
With this validation of the MOLES-S(T), there is now the possibility that the following types of research in science education in Thailand could benefit from its use. Firstly, the MOLES-S(T) can be used to collect baseline data from students on the metacognitive orientation of their science classroom learning environments. Such data can provide a snapshot, a time-stamp, of a particular classroom or set of classrooms. It can also be used to compare contexts. For example, it is worth noting that the students in Thomas's (2003) study and the students in this study both found the items in the Distributed Control scale hardest to endorse and scored it the lowest of all the scales. Students in both studies also found the items on the Emotional Support scale easiest to endorse. These findings suggest that, in both cultural contexts, the students do not perceive that they have much control over the activities they do in class, and that they both perceive reasonable levels of emotional support from their teachers. Having learning environment instruments such as the MOLES-S in multiple languages might enable in-depth cross-cultural research to be undertaken where these findings are explored more deeply.
Secondly, the data from the MOLES-S(T) can be used as part of a battery of measures to explore the effect of teaching interventions in classrooms that aim to develop and enhance Thai science students' metacognition. Examples of such work are those of Thomas in Canada (2013, 2017). To develop and enhance students' metacognition requires changes in the learning environment, most often initiated by teachers. Therefore, the MOLES-S(T) can be used to target specific elements of the classroom learning environment, e.g., Metacognitive Demands, to plan changes to pedagogy that seek to alter or increase the target element, and to establish whether students perceive the proposed changes to be evident in their classrooms. Other measures that could be used to triangulate the data could include interviews with students and teachers, and classroom observations and field notes. Having multiple sources of data can increase the dependability of assertions in research when the findings from each data set suggest the same or similar view of a phenomenon.
Thirdly, Thai teachers could use the MOLES-S(T) when conducting action research with their own students. Using the MOLES-S(T), they could establish what the students' perceptions of their learning environments are and plan changes to their pedagogy that target the elements of those environments they choose. Seeking feedback from students can enable teachers to establish what students' views are of their pedagogy, enabling them to respond to and manage students' needs.

Limitations
One should interpret statistical data from the MOLES-S with the following provisos. Firstly, because the MOLES-S is a quantitative instrument, it does not, as mentioned previously, necessarily specify the nature or quality of what it can measure. For example, in this study the mean value for the Metacognitive Demands scale was 17.29. This represents an average response of between 3 (Sometimes) and 4 (Often) from students. This is a potentially encouraging figure for it suggests that, on average, the Thai students in this sample perceived that their teachers did make metacognitive demands on them. However, the precise details of what these demands were cannot be ascertained from the number alone. This is why, as previously mentioned, (a) the MOLES-S may be most useful in mixed methods studies where a range of data sources are engaged, and (b) expansion of items for each sub-scale could be considered. Also, as noted above, there is a need to examine again the possible redundancy of some items in the MOLES-S(T), especially in the Student-Teacher Discourse and Emotional Support scales. Further research will be necessary to ascertain if trimming of any items is necessary.