Fundamental Steps for Making Psychological Test Adaptations
DOI: https://doi.org/10.46553/RPSI.19.38.2023.p121-148
Keywords: psychometrics, scale, reliability, validity, psychological testing
Abstract
This article examines the challenges and importance of adapting psychological tests. Psychology, as a science that studies the mind and behavior, faces the particular complexity of assessing intangible constructs such as emotions, thoughts, and attitudes. Unlike measurements in other scientific disciplines, psychological measurements are typically indirect and affected by measurement error, so ensuring their reliability and validity is essential. This article describes the fundamental steps for adapting psychological tests. A critical aspect of cross-cultural research is linguistic and cultural adaptation: because most psychological research originates in Anglo-Saxon countries, tests must be modified so that they are appropriate for diverse populations and languages. The authors highlight the importance of emic and etic perspectives for understanding cultural and linguistic nuances, and they address potential biases and heuristics that may influence test results. The role of adaptation in promoting a better understanding of human behavior across cultural contexts is emphasized. Finally, a clear synthesis of the steps for test adaptation is presented, following the guidelines of the International Test Commission (ITC). Incorporating cultural and linguistic considerations into test adaptation will improve the effectiveness and applicability of psychological assessments in diverse populations.
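As an aside for readers new to psychometrics, the sketch below illustrates one of the reliability coefficients discussed throughout the reference list, Cronbach's alpha, computed from a respondents-by-items score matrix. It is a minimal illustration with hypothetical data, not code from the article itself.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses of six people to a four-item Likert scale (1-5)
scores = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
])

print(f"Cronbach's alpha: {cronbach_alpha(scores):.3f}")
```

For Likert-type (ordinal) items, alternatives such as ordinal alpha or McDonald's omega are often preferred; see, for example, the Elosua Oliden & Zumbo (2008) and Viladrich, Angulo-Brunet, & Doval (2017) entries below.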
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (2014). Standards for educational and psychological testing. American Educational Research Association.
American Psychological Association. (2015). APA Dictionary of Psychology (2nd ed.). American Psychological Association. http://dx.doi.org/10.1037/14646-000
American Psychological Association, American Educational Research Association, & National Council on Measurement in Education (1999). Standards for educational and psychological testing. American Psychological Association.
Armor, D. J. (1973). Theta Reliability and Factor Scaling. Sociological Methodology, 5, 17-50. https://doi.org/10.2307/270831
Auné, S. E., Abal, F. J. P., & Attorressi, H. F. (2020). Análisis psicométrico mediante la Teoría de la Respuesta al Ítem: modelización paso a paso de una Escala de Soledad. Ciencias Psicológicas, 14(1), e-2179. https://doi.org/10.22235/cp.v14i1.2179
Attorressi, H. F., Lozzia, G. S., Abal, F. J. P., Galibert, M. S., & Aguerri, M. E. (2009). Teoría de Respuesta al ítem: Conceptos básicos y aplicaciones para la medición de constructos psicológicos. Revista Argentina de Clínica Psicológica, 18(2), 179-188. https://www.redalyc.org/pdf/2819/281921792007.pdf
Baca-García, E., Blanco, C., Sáiz-Ruiz, J., Rico, F., Diaz-Sastre, C., & Cicchetti, D. V. (2001). Assessment of reliability in the clinical evaluation of depressive symptoms among multiple investigators in a multicenter clinical trial. Psychiatry Research, 102(2), 163-173. https://doi.org/10.1016/S0165-1781(01)00249-9
Bartko, J. J. (1966). The Intraclass Correlation Coefficient as a Measure of Reliability. Psychological Reports, 19(1), 3–11. https://doi.org/10.2466/pr0.1966.19.1.3
Bean, G. J., & Bowen, N. K. (2021). Item Response Theory and Confirmatory Factor Analysis: Complementary Approaches for Scale Development. Journal of Evidence-Based Social Work, 18(6), 597-618. https://doi.org/10.1080/26408066.2021.1906813
Becerril-García, A., Aguado-López, E., Batthyány, K., Melero, R., Beigel, F., Vélez Cuartas, G., Banzato, G., Rozemblum, C., Amescua García, C., Gallardo, O., & Torres, J. (2018). AmeliCA: Una estructura sostenible e impulsada por la comunidad para el Conocimiento Abierto en América Latina y el Sur Global. México: Redalyc; Universidad Autónoma del Estado de México; Argentina: CLACSO; Universidad Nacional de La Plata; Colombia: Universidad de Antioquia. En Memoria Académica. http://www.memoria.fahce.unlp.edu.ar/libros/pm.693/pm.693.pdf
Behr, D. (2017). Assessing the use of back translation: the shortcomings of back translation as a quality testing method. International Journal of Social Research Methodology, 20(6), 573-584. https://doi.org/10.1080/13645579.2016.1252188
Berry, J. W. (1980). Acculturation as varieties of adaptation. En A. M. Padilla (Ed.), Acculturation: Theory, models and some new findings (pp. 9-25). Westview.
Bland, J. M. & Altman, D. G. (1990). A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methos of measurement. Computers in Biology and Medicine, 20(5), 337-340. https://doi.org/10.1016/0010-4825(90)90013-F
Bulat Silva, S. (2012). Saudade: A key Portuguese emotion. Emotion Review, 4(2), 203-211. https://doi.org/10.1177/1754073911430727
Brenlla, M. (2004). Aspectos socio culturales y métricos en la adaptación de tests: un estudio en base al test de inteligencia para adultos de Wechsler III (WAIS III). XI Jornadas de Investigación de la Facultad de Psicología de la Universidad de Buenos Aires, Buenos Aires. http://docplayer.es/23453801-Xi-jornadas-de-investigacion-facultad-de-psicologiauniversidad-de-buenos-aires-buenos-aires-2004
Brenlla, M. E., Fernández Da Lama, R. G., Otero, A., & Filgueira, P. (2023). The influence of titles on test validity: exploring the frame effect on the Beck Depression Inventory-second edition (unpublished manuscript).
Brenlla, M. E. & Rodríguez, C. M. (2006). Adaptación argentina del Inventario de Depresión de Beck (BDI-II). En BDI-II. Inventario de depresión de Beck (pp. 11-37). Paidós.
Campbell, D. T. & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81–105. https://doi.org/10.1037/h0046016
Carpenter, S. (2018). Ten Steps in Scale Development and Reporting: A Guide for Researchers. Communication Methods and Measures, 12(1), 25-44. https://doi.org/10.1080/19312458.2017.1396583
Casullo, M. M. (1999). La evaluación psicológica: Modelos, técnicas y contexto sociocultural. Revista Iberoamericana de diagnóstico y evaluación psicológica, 1(1), 97-113. https://aidep.org/03_ridep/R07/R077.pdf
Casullo, M. M. (2009). La evaluación psicológica: Modelos, técnicas y contextos. Revista iberoamericana de diagnóstico y evaluación psicológica, 1(27), 9-28. https://www.redalyc.org/pdf/4596/459645443002.pdf
Cattell, R. B. (1978). Factor Measures: Their Construction, Scoring, Psychometric Validity, and Consistency. The scientific use of factor analysis in behavioral and life sciences, 273-320. https://link.springer.com/chapter/10.1007/978-1-4684-2262-7_11
Choi, Y.-J. & Asilkalkan, A. (2019). R Packages for Item Response Theory Analysis: Descriptions and Features. Measurement: Interdisciplinary Research and Perspectives, 17(3), 168–175. https://doi.org/10.1080/15366367.2019.1586404
Christensen, A. P., & Golino, H. (2021). On the equivalency of factor and network loadings. Behavior Research Methods, 53(4), 1563–1580. https://doi.org/10.3758/s13428-020-01500-6
Christmann, A. & Van Aelst, S. (2006). Robust estimation of Cronbach’s alpha. Journal of Multivariate Analysis, 97(7), 1660-1674. https://doi.org/10.1016/j.jmva.2005.05.012
Cicchetti, D. V. & Showalter, D. (1997). A computer program for assessing interexaminer agreement when multiple ratings are made on a single subject. Psychiatry research, 72(1), 65-68. https://doi.org/10.1016/S0165-1781(97)00093-0
Cohen, J. (1968). Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220. https://doi.org/10.1037/h0026256
Comrey, A. L. & Lee, H. B. (1992). Interpretation and application of factor analytic results. En A. L. Comrey & H. B. Lee (Eds.), A first course in factor analysis. Hillsdale, NJ: Lawrence Erlbaum Associates. https://www.scirp.org/(S(351jmbntvnsjt1aadkposzje))/reference/ReferencesPapers.aspx?ReferenceID=2335989
Contreras Espinoza, S. & Novoa-Muñoz, F. (2018). Ventajas del alfa ordinal respecto al alfa de Cronbach ilustradas con la encuesta AUDIT-OMS. Revista Panamericana de Salud Pública, 42, e65. https://doi.org/10.26633/RPSP.2018.65
Cortada de Kohan, N. & Macbeth, G. (2007). El tamaño del efecto en la investigación psicológica. Revista de Psicología, 3(5) 25-31. http://bibliotecadigital.uca.edu.ar/repositorio/revistas/efecto-investigacion-psicologica-kohanmacbeth.pdf
Coolican, H. (2018). Research methods and statistics in psychology. Routledge.
Costello, A. B. & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical assessment, research, and evaluation, 10(1), 7. https://doi.org/10.7275/jyj1-4868
Cronbach, L. J. (1972). Fundamentos de la exploración psicológica. Biblioteca Nueva.
Cronbach, L. J. & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and psychological measurement, 64(3), 391-418. https://doi.org/10.1177/0013164404266386
Cui, G., van den Berg, S., & Jiang, Y. (1998). Cross-cultural adaptation and ethnic communication: Two structural equation models. Howard Journal of Communication, 9(1), 69-85. https://doi.org/10.1080/106461798247122
DeVellis, R. F. & Thorpe, C. T. (2017). Scale Development: Theory and Applications. SAGE Publications.
Doval, E., Viladrich, C., & Angulo-Brunet, A. (2023). Coefficient alpha: the resistance of a classic. Psicothema, 35(1), 5-20. https://www.psicothema.com/pdf/4785.pdf
Elosua Oliden, P. (2003). Sobre la validez de los tests. Psicothema, 15(2), 315-321. https://www.redalyc.org/articulo.oa?id=72715225
Elosua Oliden, P. & Zumbo, B. D. (2008). Coeficientes de confiabilidad para las escalas de respuesta categórica ordenada. Psicothema, 20(4), 896-901. https://www.psicothema.com/pi?pii=3572
Everitt, B. S. (1975). Multivariate analysis: The need for data, and other problems. The British Journal of Psychiatry, 126(3), 237-240. https://www.cambridge.org/core/journals/the-british-journal-of-psychiatry/article/abs/multivariate-analysis-the-need-for-data-and-other-problems/4226888B99AA7C3F861B3B203050AC17
Farrell, P. (2006). Portuguese saudade and other emotions of absence and longing. In B. Peeters (Ed.), Semantic primes and universal grammar: Empirical evidence from the Romance languages (pp. 235-258). John Benjamins.
Fernández Ballesteros, R. (2013). Evaluación psicológica. Conceptos, métodos y estudio de casos. Síntesis.
Fernández Da Lama, R. G. & Brenlla, M. E. (2023a). Resultados preliminares en la evaluación de actitudes hacia el ahorro en economías inestables: importancia de los factores contextuales y socioeconómicos. Revista PUCE Pontificia Universidad Católica de Ecuador, 116. https://doi.org/10.26807/revpuce.vi
Fernández Da Lama, R. G., & Brenlla, M. E. (2023b). Attitudes towards saving and debt-taking behavior during first major flexibility on pandemic restrictions in Argentina. International Journal of Economic Behavior, 13(1), 51-70. https://doi.org/10.14276/2285-0430.3716
Ferrando, P. J. & Lorenzo-Seva, U. (2014). El análisis factorial exploratorio de los ítems: algunas consideraciones adicionales. Anales de Psicología, 30(3), 1170-1175. http://dx.doi.org/10.6018/analesps.30.3.199991
Ferrando, P. J., & Lorenzo-Seva, U. (2018). Assessing the Quality and Appropriateness of Factor Solutions and Factor Score Estimates in Exploratory Item Factor Analysis. Educational and Psychological Measurement, 78(5), 762-780. https://doi.org/10.1177/0013164417719308
Ferrando, P. J., Lorenzo-Seva, U., Hernández-Dorado, A., & Muñiz, J. (2022). Decálogo para el Análisis Factorial de los ítems de un test. Psicothema, 34(1), 7-17. https://doi.org/10.7334/psicothema2021.456
Fleiss, J. L. & Cohen, J. (1973). The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability. Educational and Psychological Measurement, 33(3), 613–619. https://doi.org/10.1177/001316447303300309
Gadermann, A. M., Guhn, M., & Zumbo, B. D. (2012). Estimating ordinal reliability for Likert-type and ordinal item response data: A conceptual, empirical, and practical guide. Practical Assessment, Research, and Evaluation, 17,1-13. https://doi.org/10.7275/N560-J767
Garaigordobil, M. (1998). Evaluación Psicológica: Bases teórico-metodológicas, situación actual y directrices de futuro. Amarú.
García-Nieto, R., Parra Uribe, I., Palao, D., Lopez-Castroman, J., Sáiz, P. A., García-Portilla, M. P., Sáiz Ruiz, J., Ibañez, A., Tiana, T., Durán Sindreu, S., Pérez Sola, V., de Diego-Otero, Y., Pérez-Costillas, L., Fernández García-Andrade, R., Saiz-González, D., Jiménez Arriero, M. A., Navío Acosta, M., Giner, L., Guija, J. A., … & Baca-García, E. (2012). Protocolo breve de evaluación del suicidio: fiabilidad interexaminadores. Revista de Psiquiatría y Salud Mental, 5(1), 24-36. https://doi.org/10.1016/j.rpsm.2011.10.001
Garnier-Villarreal, M. & Jorgensen, T. D. (2020). Adapting fit indices for Bayesian structural equation modeling: Comparison to maximum likelihood. Psychological Methods, 25(1), 46–70. https://doi.org/10.1037/met0000224
Gorsuch, R. L. (1983). Three methods for analyzing limited time-series (N of 1) data. Behavioral Assessment, 5(2), 141-154. https://psycnet.apa.org/record/1983-31741-001
van Griethuijsen, R. A. L. F., van Eijck, M. W., Haste, H., den Brok, P. J., Skinner, N. C., Mansour, N., Savran Gencer, A., & BouJaoude, S. (2015). Global Patterns in Students’ Views of Science and Interest in Science. Research in Science Education, 45(4), 581-603. https://doi.org/10.1007/s11165-014-9438-6
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika 10, 255–282. https://doi.org/10.1007/BF02288892
Hancock, G. R. & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. In Structural equation modeling: Present and future (pp. 195-216).
Hays, W. L. (1960). A note on average tau as a measure of concordance. Journal of the American Statistical Association, 55(290), 331-341. https://doi.org/10.2307/2281746
Herman, B. C. (2015). The Influence of Global Warming Science Views and Sociocultural Factors on Willingness to Mitigate Global Warming. Science Education, 99(1), 1-38. https://doi.org/10.1002/sce.21136
Hernández, A., Hidalgo, M. D., Hambleton, R. K., & Gómez-Benito, J. (2020). International Test Commission guidelines for test adaptation: A criterion checklist. Psicothema, 32(3), 390-398. https://doi.org/10.7334/psicothema2019.306
He, J. & van de Vijver, F. (2012). Bias and Equivalence in Cross-Cultural Research. Online Readings in Psychology and Culture, 2(2), Article 8. https://doi.org/10.9707/2307-0919.1111
Horst, P. (1953). Correcting the Kuder-Richardson reliability formula for dispersion of item difficulties. Psychological Bulletin, 50(5), 371- 374. https://doi.org/10.1037/h0062012
International Test Commission. (2017). The ITC Guidelines for Translating and Adapting Tests (Second edition). https://www.intestcom.org/files/guideline_test_adaptation_2ed.pdf
Irvine, S. H. & Carroll, W. K. (1980). Testing and assessment across cultures: Issues in methodology and theory. En H. C. Triandis & J. W. Berry (Eds.), Handbook of Cross-Cultural Psychology, Vol. 2: Methodology (pp. 181-244). Allyn & Bacon.
Liu, J., Tang, W., Chen, G., Lu, Y., & Feng, C. (2016). Correlation and agreement: overview and clarification of competing concepts and measures. Shanghai Archives of Psychiatry, 28(2), 115-120. https://doi.org/10.11919/j.issn.1002-0829.216045
Kang, H. (2021). Sample size determination and power analysis using the G*Power software. Journal of Educational Evaluation for Health Professions, 18. https://doi.org/10.3352/jeehp.2021.18.17
Kahneman D. & Tversky A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–291. https://doi.org/10.2307/1914185
Knekta, E., Runyon, C., & Eddy, S. (2019). One Size Doesn’t Fit All: Using Factor Analysis to Gather Validity Evidence When Using Surveys in Your Research. CBE—Life Sciences Education, 18(1), 1-17. https://doi.org/10.1187/cbe.18-04-0064
Kuder, G. F. & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2(3), 151–160. https://link.springer.com/article/10.1007/BF02288391
Kyriazos, T. A. (2018). Applied psychometrics: sample size and sample power considerations in factor analysis (EFA, CFA) and SEM in general. Psychology, 9(08), 2207. https://www.scirp.org/html/15-6902564_86856.htm
Lapata, M. (2006). Automatic evaluation of information ordering: Kendall's tau. Computational Linguistics, 32(4), 471-484. https://doi.org/10.1162/coli.2006.32.4.471
MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4(1), 84–99. https://doi.org/10.1037/1082-989x.4.1.84
Marín, G. (1986). Consideraciones metodológicas básicas para conducir investigaciones en América Latina. Acta Psiquiátrica y Psicológica de América Latina, 32(3), 183-192. https://pesquisa.bvsalud.org/portal/resource/pt/lil-44521?lang=es
Martínez Arias, M. del R. (1995). Psicometría: Teoría de los tests psicológicos y educativos. Síntesis.
McDonald, R. P. (1999). Test theory: A unified treatment. Lawrence Erlbaum Associates Publishers.
McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia Medica, 22(3), 276-282. https://doi.org/10.11613/BM.2012.031
Merino Soto, C. & Charter, R. (2010). Modificación Horst al Coeficiente KR-20 por Dispersión de la Dificultad de los Ítems. Revista Interamericana de Psicología/Interamerican Journal of Psychology, 44(2), 274-278. https://www.redalyc.org/pdf/284/28420641008.pdf
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741-749. https://doi.org/10.1037/0003-066X.50.9.741
Mikulic, I. M. (2007). Construcción y adaptación de Pruebas Psicológicas. https://acortar.link/xMByre
Mostowlansky, T., & Rota, A. (2020). Emic and Etic. En F. Stein, S. Lazar, M. Candea, H. Diemberger, J. Robbins, A. Sanchez, & R. Stasch (Eds.). The Cambridge Encyclopedia of Anthropology (pp. 1-16). University of Cambridge. https://boris.unibe.ch/154189/
Muñiz, J. (1998). La Medición de lo Psicológico. Psicothema, 10(1), 1-21. https://reunido.uniovi.es/index.php/PST/article/view/7442
Muñiz, J. (2010). Las teorías de los tests: teoría clásica y teoría de respuesta a los ítems. Papeles del Psicólogo, 31(1), 57-66. https://digibuo.uniovi.es/dspace/bitstream/handle/10651/10994/?sequence=1
Muñiz, J., Elosua, P., & Hambleton, R. K. (2013). Directrices para la traducción y adaptación de los tests: segunda edición. Psicothema, 25(2), 151-157. https://doi.org/10.7334/psicothema2013.24
Muñiz, J., & Fonseca-Pedrero, E. (2019). Diez pasos para la construcción de un test. Psicothema, 31(1), 7-16. https://digibuo.uniovi.es/dspace/bitstream/handle/10651/51958/Diez.pdf?sequence=1
Neto, F. & Mullet, E. (2014). A Prototype Analysis of the Portuguese Concept of Saudade. Journal of Cross-Cultural Psychology, 45(4), 660–670. https://doi.org/10.1177/0022022113518370
Pedrosa, I., Suárez-Álvarez, J., & García-Cueto, E. (2014). Evidencias sobre la Validez de Contenido: Avances Teóricos y Métodos para su Estimación. Acción Psicológica, 10(2), 3-20. http://dx.doi.org/10.5944/ap.10.2.11820
Polit, D. F. (2014). Getting serious about test–retest reliability: a critique of retest research and some recommendations. Quality of Life Research: An international journal of quality of life aspects of treatment, care and rehabilitation, 23(6), 1713–1720. https://doi.org/10.1007/s11136-014-0632-9
Preacher, K. J. & Coffman, D. L. (2006, May). Computing power and minimum sample size for RMSEA [Computer software]. http://quantpsy.org/.
Quesada Pacheco, M. Á. (2014). División dialectal del español de América según sus hablantes. Análisis dialectológico perceptual. Boletín de Filología, 49(2), 257-309. http://dx.doi.org/10.4067/S0718-93032014000200012
Revelle, W. (1979). Hierarchical Cluster Analysis and The Internal Structure Of Tests. Multivariate Behavioral Research, 14(1), 57-74. https://doi.org/10.1207/s15327906mbr1401_4
Rigo, D. Y. & Donolo, D. (2018). Modelos de Ecuaciones Estructurales usos en investigación psicológica y educativa. Revista Interamericana de Psicología/Interamerican Journal of Psychology, 52(3), 345-357. https://doi.org/10.30849/ripijp.v52i3.388
RStudio. (2015, April 24). RStudio. http://www.rstudio.com/about/
Sijtsma, K. (2009). On the Use, the Misuse, and the Very Limited Usefulness of Cronbach’s Alpha. Psychometrika, 74(1), 107-120. https://doi.org/10.1007/s11336-008-9101-0
Sireci, S. G. (1998). Gathering and analyzing content validity data. Educational Assessment, 5(4), 299-321. https://doi.org/10.1207/s15326977ea0504_2
Spearman, C. (1905). Proof and Disproof of Correlation. The American Journal of Psychology, 16(2), 228-231. https://doi.org/10.2307/1412129
Stevens, S. S. (1946). On the Theory of Scales of Measurement. Science, 103(2684), 677-680. https://doi.org/10.1126/science.103.2684.677
Taber, K. S. (2018). The Use of Cronbach’s Alpha When Developing and Reporting Research Instruments in Science Education. Research in Science Education, 48(6), 1273-1296. https://doi.org/10.1007/s11165-016-9602-2
Taborda, A. R., Brenlla, M. E., & Barbenza, C. (2011). Adaptación argentina de la Escala de Inteligencia de Wechsler para Niños IV (WISC-IV). En D. Wechsler. Escala de Inteligencia de Wechsler para Niños IV (WISC-IV) (pp. 37-55). Paidós.
Thompson, B. (2004). Exploratory and confirmatory factor analysis: Understanding concepts and applications. Applied Psychological Measurement, 31(3), 245-248. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=1a0656fae7e2bed422fed08bedf0dab73203f325
Toland, M. D. (2013). Practical Guide to Conducting an Item Response Theory Analysis. The Journal of Early Adolescence, 34(1), 120–151. https://doi.org/10.1177/0272431613511332
Triandis, H. C., & Brislin, R. W. (1984). Cross-cultural psychology. American psychologist, 39(9), 1006-1016. https://doi.org/10.1037/0003-066X.39.9.1006
Tristán-López, A., & Pedraza Corpus, N. Y. (2017). La objetividad en las pruebas estandarizadas. Revista Iberoamericana de evaluación educativa, 10(1), 11-31. https://doi.org/10.15366/riee2017.10.1.001
Ursachi, G., Horodnic, I. A., & Zait, A. (2015). How Reliable are Measurement Scales? External Factors with Indirect Influence on Reliability Estimators. Procedia Economics and Finance, 20, 679-686. https://doi.org/10.1016/S2212-5671(15)00123-9
Vasconcelos, M. C. (1996). A saudade portuguesa. Guimarães Editors.
Ventura-León, J. L. (2016). Breve historia del concepto de validez en Psicometría. Revista peruana de historia de la Psicología, 2, 89-92. https://historiapsiperu.org.pe/wp-content/uploads/2021/08/Version-completa-del-volumen-2.pdf#page=89
van de Vijver, F. & Hambleton, R. K. (1996). Translating tests: Some practical guidelines. European Psychologist, 1(2), 89-99. https://doi.org/10.1027/1016-9040.1.2.89
Viladrich, C., Angulo-Brunet, A., & Doval, E. (2017). Un viaje alrededor de alfa y omega para estimar la fiabilidad de consistencia interna. Anales de Psicología/Annals of Psychology, 33(3), 755-782. https://revistas.um.es/analesps/article/view/analesps.33.3.268401
Weir, J. (2005). Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research, 19(1), 231-240. https://doi.org/10.1519/15184.1
Zumbo, B. D. (2003). Does item-level DIF manifest itself in scale-level analyses? Implications for translating language tests. Language Testing, 20(2), 136–147. https://doi.org/10.1191/0265532203lt248oa
Zumbo, B. D. (2007). Three Generations of DIF Analyses: Considering Where It Has Been, Where It Is Now, and Where It Is Going. Language Assessment Quarterly, 4(2), 223–233. https://doi.org/10.1080/15434300701375832
Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal Versions of Coefficients Alpha and Theta for Likert Rating Scales. Journal of Modern Applied Statistical Methods, 6(1), 21-29. https://doi.org/10.22237/jmasm/1177992180
Zumbo, B. D., Liu, Y., Wu, A. D., Shear, B. R., Olvera Astivia, O. L., & Ark, T. K. (2015). A Methodology for Zumbo’s Third Generation DIF Analyses and the Ecology of Item Responding. Language Assessment Quarterly, 12(1), 136–151. https://doi.org/10.1080/15434303.2014.972559
License
Copyright (c) 2023 María Elena Brenlla, Mariana Soledad Seivane, Rocío Giselle Fernández da Lama, Guadalupe Germano
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.