A Measurement Tool

In a previous post I made the assertion that “Science can heal the Montessori movement if we are willing to do the research.”

After such a bold statement it would be reasonable to ask “Well then, what research would you propose?”

I am interested in comparing the efficacy of different methods using empirical evidence (not opinion, experience, or anecdote).  I want to compare specific practices (such as the three-period lesson with three objects vs. two objects) and to explore holistic comparisons (AMI vs. AMS vs. other trainings).  I sincerely believe that research is the most effective way to establish best practices that are supported by empirical evidence and to unify the Montessori movement.

In a college Social Psychology class, almost 20 years ago, I learned the basics of designing and conducting a research study from Dr. Laura Sinnet.  I am not a professional researcher, but we don’t have to be professional researchers to conduct experiments.  We just have to be passionate, curious, and willing to learn.  Amateur researchers are better than no researchers.

One of the lessons I learned from Dr. Sinnet was the design of a Nomological Network, “a representation of the concepts (constructs) of interest in a study, their observable manifestations, and the interrelationships among and between these.”

Say, for example, that we want to conduct an experiment about meditation and relaxation.  Meditation and relaxation are intellectual concepts (or constructs) and are therefore a little vague and subjective.  What kind of meditation?  Transcendental? Yoga? Zen Buddhist? Tibetan Buddhist? Guided visualization?  What does “relaxed” mean?  How do we start with vague and subjective constructs and move to a real world experiment?

We construct a Nomological Network that defines the relationship between the constructs (Hypothesis: Meditation induces relaxation).  Then we define the observable manifestations of those constructs.  In this example we would clearly define the methodology of the meditation we are testing and other important variables (subject selection, previous meditation experience of subjects, etc.) as well as the method of measuring relaxation (resting heart rate, muscle tension, respiration rate, etc.).

[Image: Measurement Tool]

The Nomological Network for my question would look something like this…

[Image: Measurement Tool (1)]

When I consider my Nomological Network it is obvious to me that an important component is missing.  We do not have a tool to reliably measure the observable phenomena of Normalization.

Trained Montessori Guides have been presented with a definition of normalization and have seen normalized children in observations and practical experience.  However, these experiences result in subjective, individualized definitions and scales of normalization.  What is highly normalized in my experience might be barely normalized in the experience of another Guide.  If five Guides observe the same environment we might get five different ratings of normalization, and that is not a reliable measure.

“In psychometrics, reliability is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. For example, measurements of people’s height and weight are often extremely reliable.”


We need a tool or methodology for measuring Normalization that is valid and highly reliable.  Such a tool will allow for experiments that can be repeated to confirm similar results (a hallmark of the scientific method) as well as large-scale experiments that combine the observations and measurements of many individuals (for example, comparing normalization levels in 500 AMI environments and 500 AMS environments).

As I stated earlier, my interest is in comparing the efficacy of different methods using empirical evidence (not opinion, experience, or anecdote).  However, since this research requires a valid and reliable tool to measure Normalization, I would have to begin by researching and developing a measure that has been tested and shown to be reliable.

How do we develop a Normalization Measure?  Initially I am curious whether there might already be informal methods of measuring Normalization that could be assessed and formalized into a Measurement Tool.  If there are experts who are able to reliably measure Normalization, then we might be able to codify and define their techniques for use as a Normalization Measure that anyone can be trained to use.

Beginning with the assumption that individuals with greater experience and extensive study of the Montessori method would be most accurate in their measure of Normalization, I would begin my research by testing a sample of individuals who meet these criteria.  As an AMI-trained Guide I immediately consider AMI Trainers, because all AMI Trainers have a minimum of 5 years of experience in a working environment and have completed rigorous academic preparations.

I would start by gathering video of 11 Primary environments from conditions assumed to result in varying levels of normalization.  For example, an environment in a well-established school with the same Guide for the past 10 years would be assumed to be very Normalized, while a new environment in a new school with a recent graduate would be assumed to be not at all Normalized.

I would then invite as many AMI Primary Trainers as possible to participate in the research.  The experimental method would look something like this…

  • A website is constructed to conduct the research online.  
  • An initial questionnaire gathers participant demographic information including name, gender, age, levels of training completed, years of experience before becoming a trainer, and number of courses completed as a trainer.
  • Researcher text or video explains that the participant is to watch each video and rate the level of Normalization observed in the environment.  The ratings use a predetermined Likert scale with 5-7 descriptors.  The participant is asked to narrate their thought process, including specific observations, while watching the videos.  Participants may stop the video when they are confident in their rating of the environment and proceed to the next video.
  • A single video is selected as the first video seen by all participants as a practice/baseline video.  Subsequent videos are viewed in a random order.
  • A screen capture program is used to record the participant’s voice and cursor movements while they narrate their thought process and complete ratings.
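
As a rough illustration of the viewing order described above (one shared practice/baseline video first, the rest shuffled), here is a minimal Python sketch.  The function name, the video labels, and the use of a per-participant seed are my own assumptions for the example and not part of any existing tool.

```python
import random

def presentation_order(video_ids, baseline_id, participant_seed):
    """Return one participant's viewing order: baseline first, the rest shuffled."""
    remaining = [v for v in video_ids if v != baseline_id]
    # A per-participant seed (an assumption of this sketch) makes the order
    # reproducible, so it can be reconstructed later during analysis.
    rng = random.Random(participant_seed)
    rng.shuffle(remaining)
    return [baseline_id] + remaining

# Hypothetical labels for the 11 Primary environment videos.
videos = [f"video_{i:02d}" for i in range(1, 12)]
order = presentation_order(videos, baseline_id="video_01", participant_seed=42)
print(order)
```

Randomizing the order for each participant is meant to keep any single video from always benefiting (or suffering) from being seen early or late.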

Analyzing the results

  • The ratings would be analyzed for inter-rater reliability to determine if there is consensus among the AMI trainers.  
    • If there is a high degree of inter-rater reliability the next step is to examine the narration videos and look for patterns and similar approaches to rating Normalization.  These would become the basis for a Normalization Measure.  It would also be interesting to compare demographic information for factors that significantly influence inter-rater reliability.
    • If there is not a high degree of inter-rater reliability then we abandon the assumption that individuals with greater experience and extensive study of the Montessori method would be most accurate in their measure of Normalization.  An entirely new approach would be needed.
  • We would also analyze the time used by each participant to determine the Normalization rating.  This may provide some insight into the amount of observation time required to rate Normalization.
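
The inter-rater reliability check above could be sketched in a few lines of Python.  I have not settled on a statistic; this example assumes a two-way intraclass correlation, usually written ICC(2,1), which treats the Likert ratings as numbers and requires every trainer to rate every video.  Other statistics (for example Krippendorff's alpha for ordinal data) may turn out to be a better fit.  The function name and the ratings are made up for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete table: one row per video, one column per rater."""
    n = len(ratings)       # number of videos
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way decomposition of the total sum of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between videos
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Undefined (division by zero) if every rating in the table is identical.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Made-up ratings on a 5-point scale: 4 videos (rows) by 3 trainers (columns).
example = [
    [5, 4, 5],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 1],
]
print(round(icc_2_1(example), 3))
```

A value near 1 would mean the trainers largely agree on which environments are more Normalized; a value near 0 would mean the ratings say more about the rater than about the environment.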

Future experiments

  • If a high degree of inter-rater reliability is found among AMI trainers it would be interesting to repeat the experiment with trainers from other organizations.  Would inter-rater reliability be similar among trainers from different trainings?  How would the inter-rater reliability change if all participants are considered as a single sample?  Is the variation in ratings greater within a training than between trainings?
  • If a Normalization Measure is created using the narration videos the reliability of that measure would be tested with new participants.  We would develop a training method and select samples for study (current AMI students, recent graduates, Guides with 5-10 years of experience, Guides with 10+ years of experience, Assistants, Administrators, Parents, individuals not associated with the Montessori community).  It is possible that additional videos would be required for this process.  We would compare ratings for inter-rater reliability, and adjust the training method until a high degree of reliability is achieved.  When a high degree of reliability is achieved with videos we would test the ratings in actual environments to confirm that the Normalization Measure continues to be reliable in the field.
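
The question of whether ratings vary more within a training than between trainings could be explored, for a single video, with a simple one-way breakdown of the variance.  This is only a sketch under my own assumptions: the group labels and ratings are invented, and a real analysis would need to account for every video and every rater together.

```python
from statistics import mean

def variance_components(ratings_by_group):
    """Return (between-group, within-group) mean squares for one video's ratings."""
    groups = list(ratings_by_group.values())
    all_ratings = [r for g in groups for r in g]
    grand = mean(all_ratings)

    # Spread of the group averages around the overall average.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Spread of individual ratings around their own group's average.
    ss_within = sum((r - mean(g)) ** 2 for g in groups for r in g)

    df_between = len(groups) - 1
    df_within = len(all_ratings) - len(groups)
    return ss_between / df_between, ss_within / df_within

# Invented ratings of one video by trainers from two hypothetical trainings.
ratings = {
    "training_a": [4, 5, 4, 5],
    "training_b": [3, 4, 2, 3],
}
between, within = variance_components(ratings)
print(between, within)
```

If the between-group figure is much larger than the within-group figure, trainers from the same training agree with each other more than they agree with trainers from other trainings.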

Now you know the research I personally would propose.  Next we can ask, “How can Montessori research be funded?”

