We propose to start this module by reflecting on the following research question: “Which factors influence the performance of high school students on their graduation exam?” Each of you has some thoughts about these factors based on your personal experience as a former high school student. Thus, you might mention factors such as gender, educational achievement, family background, residence, health status, risky behaviour, etc. The impact of these factors on school success is well documented in the scientific literature in the field of education. In 2015, we identified a research gap while investigating this question: although young people engage in romantic and intimate relationships during adolescence, we found no previous study looking at the influence of this factor on school achievement (Dégi and Faludi, 2015). Now let’s say that, in approaching this research question, you decide to collect data through a sociological survey, using a self-administered online questionnaire given to high school students in your locality. When you design the questionnaire, you must answer the crucial question: “How can I most appropriately measure the aforementioned factors and the outcomes influenced by these factors?”
Before proceeding to the actual measurement, let’s introduce the term “variable”: a trait, characteristic, or attribute possessed by the statistical individual, which varies within the studied sample or population and is of interest to the researcher. In our example, the statistical individual is the high school student, and the student’s characteristics are variables, since we know that high school students vary in terms of their school grades, family background, health status, etc.
The first step of the measurement process is to establish the exact meaning of each variable involved in the research; this procedure is called conceptualization. In the mentioned study (Dégi and Faludi, 2015), the authors agreed on the following specifications for some of the investigated variables: for educational achievement, whether the student had to re-take any of the examinations in the last semester of the 12th grade; for family background, whether the family of origin had a low socio-economic status; for residence, the place of residence at present; for health status, self-rated health; for risky behaviours, the consumption of illegal substances; for intimate relationship, the type of relationship with the partner at the time of the first sexual intercourse. The conceptualization of performance in the graduation exam also requires an exact description of this form of evaluation. In Romania, where the study was conducted, this exam is called the Baccalaureate, a compulsory nationwide examination held when students graduate from secondary education, called Liceu (the equivalent of high school), after 12 years of studies in total. The Baccalaureate is a single degree qualification test graded on a 10-point scale (with a pass mark of 6), and it is required for the university admission exam.
The next step of the measurement process is to operationalize each variable. Operationalization means settling the precise procedures used to measure the attributes of each variable. In other words, operationalization involves specifying the range of variation of each variable and deciding on the appropriate level of measurement, an aspect which will be detailed in the next section of the module. We continue with the same example: the investigated variables were operationalized to meet the requirements of multivariate analysis. Thus, for the variables 'whether the student had to re-take any of the examinations in the last semester of the 12th grade', 'whether the family of origin had a low socio-economic status', and 'the consumption of illegal substances', the variation of answers was captured by a dichotomous answer of no or yes; for gender, the answers were female and male; for place of residence at present, the choices were urban and rural; for self-rated health, the answers ranged from very good through fair to poor; and for the relationship with the partner at the time of the first sexual intercourse, the authors distinguished between a steady and an occasional relationship. Of course, the conceptualization implies that the researchers specify exactly what a steady relationship means. In this case, an intimate relationship was defined as steady if the two partners had been together for at least 3 months before the first intercourse. Performance at the Baccalaureate exam was operationalized using two variables: taking the Baccalaureate examination (yes versus no) and the result obtained in the Baccalaureate examination (good versus bad, where a good result means a grade equal to or above the average grade point, and a bad result means a grade below the average grade point).
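To make this step more concrete, the answer categories described above can be written down as a small codebook. The following is only a minimal sketch in Python: the variable names, the numeric codes, and the helper functions are our own illustrative assumptions and are not taken from the original study. Only the answer categories, the 3-month rule, and the "equal to or above the average" rule come from the text. We also assume that self-rated health had exactly the three categories listed.

```python
# Hypothetical codebook: variable names and numeric codes are illustrative
# assumptions; only the answer categories come from the text above.
CODEBOOK = {
    "retook_exam_12th_grade": {"no": 0, "yes": 1},
    "low_socioeconomic_status": {"no": 0, "yes": 1},
    "illegal_substance_use": {"no": 0, "yes": 1},
    "gender": {"female": 0, "male": 1},
    "residence": {"urban": 0, "rural": 1},
    # Assumption: only the three categories named in the text are used.
    "self_rated_health": {"very good": 0, "fair": 1, "poor": 2},
    "relationship_at_first_intercourse": {"steady": 0, "occasional": 1},
    "took_baccalaureate": {"no": 0, "yes": 1},
    "baccalaureate_result": {"bad": 0, "good": 1},
}


def classify_relationship(months_together):
    """Steady if the partners were together at least 3 months beforehand."""
    return "steady" if months_together >= 3 else "occasional"


def classify_result(grade, average_grade):
    """Good if the grade is equal to or above the average grade point."""
    return "good" if grade >= average_grade else "bad"


def encode_response(raw_answers):
    """Translate one respondent's answers into numeric codes.

    Raises ValueError for an answer that the codebook does not define,
    so that data-entry errors are noticed instead of silently coded.
    """
    coded = {}
    for variable, answer in raw_answers.items():
        categories = CODEBOOK[variable]
        if answer not in categories:
            raise ValueError(f"Unknown answer {answer!r} for {variable!r}")
        coded[variable] = categories[answer]
    return coded


# Example with an invented respondent (not real data).
respondent = {
    "gender": "female",
    "residence": "rural",
    "self_rated_health": "fair",
    "relationship_at_first_intercourse": classify_relationship(5),
    "baccalaureate_result": classify_result(grade=8.2, average_grade=7.5),
}
coded_respondent = encode_response(respondent)
```

Writing the categories down in this way forces you to decide, before collecting any data, which answers are possible for each variable and how each answer will be recorded.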
Here we must underline three points:
We will continue our module by explaining two other concepts, independent and dependent variables, while referring to the same study. You will recall the research gap that the authors identified. The authors assumed that there might be an association between the context of students' intimate relationships and their performance at the Baccalaureate. In the methodology of quantitative research, we call this statement a research hypothesis. Here, as in other studies, the researchers are interested in finding a relationship between two or more variables, one of them called the dependent variable and the other(s) called the independent variable(s). The dependent variable is the one we wish to explain in our research study; its variations represent the effects of the independent variable(s), also referred to as factors. In this example, graduation at the Baccalaureate and the result obtained in this examination are the dependent variables, and these may depend, among other factors, on the type of relationship with the intimate partner, which is the independent variable. The results of the multivariate analysis not only confirmed this hypothesis, but also showed that the type of relationship with the partner at first sexual intercourse was the most important factor for school performance. More precisely, high school students who began their sexual life with an occasional partner were almost three times more likely to fail the Baccalaureate and more than five times more likely to obtain poor results in this examination. Although the questionnaire was completed by a relatively small number of respondents (401 high school students), such a result might have an impact on educational policy when educators plan programs aimed at improving school performance.
Returning to the discussion about dependent and independent variables, there is one more important detail to mention. A given variable is referred to as dependent or independent depending on the focus of investigation in a given research study (Weinbach and Grinnell, 2010). For example, performance at the Baccalaureate was the dependent variable in the presented study, but it could become the independent variable in a study which explores the influence of the Baccalaureate result on the result of the university admission exam.
You must consider what your independent and dependent variables are when you plan to conduct an experiment. In an experiment, the dependent variable is the outcome variable, that is, the change the researcher expects to see as a result of an intervention or treatment. The intervention or treatment is called the independent variable. For example, if you design a school program to prevent HIV infection, you can examine the effect of the teaching method used (informative techniques versus interactive techniques) on the level of knowledge of two different groups of ninth-grade pupils. In this case, the independent variable is the teaching method and the dependent variable is the level of knowledge about the prevention of HIV infection acquired by the end of the program. You can evaluate the success of your intervention at the end of the program through a test containing, for example, 10 knowledge questions scored with one point each, completed by both groups. By applying the independent samples t-test, you can find out whether one type of program (the interactive or the informative one) was more effective than the other. You can also see the progress in knowledge among the pupils in each group if you apply the same knowledge test at the beginning and at the end of the program and then analyse the results with the paired samples t-test. If the mean of the class is significantly higher at the end of the program, then you might conclude that your intervention was successful. You can also decide to use one group as the experimental group and the other as the control group. You do not intervene at all in the control group, but you conduct your intervention program in the experimental group. Both groups are evaluated with the same knowledge test at the beginning and at the end of the intervention, and the results of the experimental group are compared with those of the control group.
If the mean score in the experimental group is significantly higher than that in the control group, then you can say that the experimental group is better informed about the prevention of HIV infection than the control group. In the last section of this module you can find out what needs to be done to ensure the reliability and validity of such a test used to evaluate the level of knowledge of a group or sample.
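The two t-tests mentioned in this example can be illustrated with a short calculation. The sketch below, written in Python with only the standard library, computes the t statistic and the degrees of freedom for an independent samples t-test (in its classical form, which assumes equal variances in the two groups) and for a paired samples t-test. The scores are invented purely for illustration and the function names are our own. The sketch does not compute the p-value: to decide whether a difference is significant, you would compare the t statistic with a t table or rely on statistical software.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Student's independent samples t statistic (equal variances assumed).

    Returns the t statistic and the degrees of freedom (n_a + n_b - 2).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("No variation in the scores; t is undefined.")
    return (mean(group_a) - mean(group_b)) / standard_error, df


def paired_t(before, after):
    """Paired samples t statistic for the same pupils tested twice.

    Returns the t statistic and the degrees of freedom (n - 1).
    """
    if len(before) != len(after):
        raise ValueError("Each pupil needs a score before and after.")
    # Difference for each pupil: score after minus score before.
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    sd = stdev(differences)
    if sd == 0:
        raise ValueError("All differences are identical; t is undefined.")
    return mean(differences) / (sd / sqrt(n)), n - 1


# Invented test scores out of 10 (NOT real data).
interactive_end = [8, 9, 7, 10, 8, 9]    # interactive group, end of program
informative_end = [6, 7, 5, 8, 6, 7]     # informative group, end of program
interactive_start = [4, 5, 3, 6, 5, 4]   # interactive group, start of program

t_between, df_between = independent_t(interactive_end, informative_end)
t_within, df_within = paired_t(interactive_start, interactive_end)
print(f"independent samples: t = {t_between:.2f}, df = {df_between}")
print(f"paired samples:      t = {t_within:.2f}, df = {df_within}")
```

Note how the choice of test follows from the design: two different groups of pupils call for the independent samples version, while two measurements of the same pupils call for the paired version.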
Before moving on to the measurement levels of variables, we want to draw your attention to some other important issues you should reflect on during the conceptualization and operationalization of variables. At the beginning of your experience as a researcher, when you design an instrument for data collection, you may fall into the trap of conceptualizing and operationalizing the studied topic according to your own values, opinions, or prejudices. Conceptualization and measurement should never be guided by bias or by a preference for particular research outcomes, as this can produce a distorted image of reality. Below you can find an example of such a trap:
We have seen that most of the concepts of interest to social researchers are open to varied meanings. Suppose, for example, that you are interested in sampling public opinion on the abortion issue in the United States. Notice the difference it would make if you conceptualized one side of the debate as “pro-choice” or as “pro-abortion.” If your personal bias made you want to minimize support for having an abortion, you might be tempted to frame the concept and the measurements based on it in terms of people being “pro-abortion,” thereby eliminating all those who were not especially fond of abortion per se but felt a woman should have the right to make that choice for herself. To pursue this strategy, however, would violate accepted research ethics (…) There is no one, correct way to conceptualize this issue, but it would be unethical to seek to slant the results through a biased definition of the issue. (Babbie, 2013: 194).
One way to avoid this bias is to search for a standardized and agreed-upon conceptualization and operationalization of the studied topic, which you can find in the scientific literature or on reliable websites with statistical data (National Institute of Statistics, Eurostat, WHO, etc.). “Using standardized measures of concepts present two advantages: they have been extensively pretested and debugged, and studies using the same scales can be compared” (Babbie, 2013: 176). Below you can see an example of how you can measure the socio-economic situation of a family or household, using the standardized concept of material deprivation as defined by the European Commission:
On the Eurostat platform, you can find that material deprivation “refers to a state of economic strain and durables, defined as the enforced inability (rather than the choice not to do so)” to “afford some items considered by most people to be desirable or even necessary to lead an adequate life”. The severe material deprivation rate is an indicator which measures the percentage of people living in households that cannot afford at least four of the following nine items: 1) mortgage or rent payments, utility bills, hire purchase instalments or other loan payments; 2) keeping their home adequately warm; 3) facing unexpected financial expenses; 4) eating a meal with meat, chicken, fish or a vegetarian equivalent every second day; 5) going on a one-week holiday away from home; 6) a colour television; 7) a washing machine; 8) a car; 9) a telephone (including mobile telephone). According to the Eurostat data, “in 2019, the highest rates were 19.9% in Bulgaria, 15.9% in Greece and 12.6% in Romania, while the lowest rates, varying between 2.7% and 2.4% were in Czechia, Germany, Denmark, Slovenia, the Netherlands and Finland”. Here, the unit of analysis is the country. When you measure this concept in different groups, let’s say different age groups, you can compare them on the basis of severe material deprivation: “in 2019, the EU-27 severe material deprivation rate for people aged less than 18 years was the same (5.8 %) as for persons aged 18-64 years. In all previous years (2010-2018) the rate for younger people had been higher.”
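The rule "cannot afford at least four of the nine items" is itself an operationalization, and it can be expressed as a short calculation. The Python sketch below is only an illustration under our own assumptions: the item labels, the data layout (each household recorded as its number of members plus its answers), and the example households are invented and do not reproduce Eurostat's actual data processing. Because the indicator refers to the percentage of people, not of households, each household is weighted by its number of members.

```python
# Short labels for the nine items listed above (labels are our own).
DEPRIVATION_ITEMS = (
    "payments",             # 1) mortgage/rent, bills, instalments, loans
    "warm_home",            # 2) keeping the home adequately warm
    "unexpected_expenses",  # 3) facing unexpected financial expenses
    "protein_meal",         # 4) meat, fish or equivalent every second day
    "holiday",              # 5) one-week holiday away from home
    "colour_tv",            # 6) colour television
    "washing_machine",      # 7) washing machine
    "car",                  # 8) car
    "telephone",            # 9) telephone (including mobile)
)
SEVERE_THRESHOLD = 4  # "at least four of the following nine items"


def count_unaffordable(answers):
    """Count the items a household cannot afford.

    `answers` maps an item label to True when the household CANNOT
    afford it; missing items are treated as affordable (an assumption).
    """
    return sum(1 for item in DEPRIVATION_ITEMS if answers.get(item, False))


def is_severely_deprived(answers):
    """Apply the threshold of at least four unaffordable items."""
    return count_unaffordable(answers) >= SEVERE_THRESHOLD


def severe_deprivation_rate(households):
    """Percentage of PEOPLE living in severely deprived households.

    `households` is a list of (number_of_members, answers) pairs.
    """
    total_people = sum(size for size, _ in households)
    if total_people == 0:
        raise ValueError("No people in the sample.")
    deprived_people = sum(
        size for size, answers in households if is_severely_deprived(answers)
    )
    return 100 * deprived_people / total_people


# Invented sample of three households (NOT real data).
sample = [
    (4, {"holiday": True, "car": True, "unexpected_expenses": True,
         "warm_home": True}),                 # 4 items lacking: deprived
    (3, {"holiday": True, "car": True}),      # 2 items lacking: not deprived
    (3, {}),                                  # nothing lacking: not deprived
]
rate = severe_deprivation_rate(sample)  # 4 of 10 people, i.e. 40.0
```

The same function can be applied separately to subgroups, for example to households grouped by country or to people grouped by age, which is how comparisons such as those quoted above become possible.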