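Elektriska flickor och mekaniska pojkar: Om gruppskillnader på prov - en metodutveckling och en studie av skillnader mellan flickor och pojkar på centrala prov i fysik [English: Electric girls and mechanical boys: On group differences in tests - a method development and a study of differences between girls and boys on national tests in physics]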

This is a dissertation from Umeå: Umeå universitet

Abstract: This dissertation served two purposes. The first was to develop a method for detecting differential item functioning (DIF) in tests containing both dichotomously and polytomously scored items. The second was related to gender and aimed a) to investigate whether the items that functioned differently for girls and boys showed any characteristic properties and, if so, b) to determine whether these properties could be used to predict which items would be flagged for DIF. The method development was based on the Mantel-Haenszel (MH) method used for dichotomously scored items. By dichotomizing the polytomously scored items, both types of item could be compared on the same statistical level, as either solved or non-solved. It was not possible to compare the internal score structures for the two gender groups; only overall score differences were detected. By modelling the empirical item characteristic curves, it was possible to develop an MH method for identifying nonuniform DIF. Both internal and external ability criteria were used. Total test score with no purification was used as the internal criterion. For validity reasons, purification was not performed; no items were judged to be biased. Teacher-set marks were used as the external criterion. The marking scale had to be transformed for either boys or girls, since a comparison of scores for boys and girls with the same marks showed that boys always had higher mean scores. The results of the two MH analyses, based on the internal and the external criterion, were compared with results from P-SIBTEST. All three methods corresponded well, although P-SIBTEST flagged considerably more items in favour of the reference group (boys), which exhibited a higher overall ability.
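
The dichotomize-then-compare idea described above can be illustrated with a minimal Python sketch of the standard Mantel-Haenszel common odds ratio. It is not the dissertation's implementation: the function names, the group labels "ref" and "focal", and the dichotomization threshold are hypothetical, and the -2.35 scaling is the conventional ETS delta transformation, which the abstract does not mention. Examinees are matched on a criterion score (for example total test score or teacher-set mark), and one 2x2 table is formed per score level.

```python
import math
from collections import defaultdict


def dichotomize(score, threshold):
    """Treat a polytomous item score as solved (1) once it reaches the
    threshold, otherwise as non-solved (0). The threshold is an assumption."""
    return 1 if score >= threshold else 0


def mantel_haenszel_dif(records):
    """Estimate the MH common odds ratio for one item.

    records: iterable of (group, matching_score, solved), where group is
    "ref" or "focal" (hypothetical labels) and solved is 0 or 1.
    Returns (alpha_mh, mh_d_dif).
    """
    # One 2x2 table per matching-score level, stored as [A, B, C, D]:
    # A = ref solved, B = ref non-solved, C = focal solved, D = focal non-solved.
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, level, solved in records:
        index = (0 if group == "ref" else 2) + (0 if solved else 1)
        tables[level][index] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if numerator == 0 or denominator == 0:
        raise ValueError("odds ratio undefined for these tables")

    alpha_mh = numerator / denominator
    # ETS delta scale: negative values favour the reference group.
    mh_d_dif = -2.35 * math.log(alpha_mh)
    return alpha_mh, mh_d_dif
```

An odds ratio above 1 indicates that, among examinees at the same criterion level, the reference group solved the item more often than the focal group. This sketch only covers uniform DIF; the nonuniform extension mentioned in the abstract is not shown.
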
All 200 items included in the last 15 annual national tests in physics were analysed for DIF and classified by ten criteria. The most significant result was that items in electricity were, to a significantly higher degree, flagged as DIF in favour of girls, whilst items in mechanics were flagged in favour of boys. Items in other content areas showed no significant pattern. Multiple-choice items were flagged in favour of boys. Regardless of the degree of significance with which items from different content areas were flagged at group level, it was not possible to predict which single item would be flagged for DIF. The most probable prediction was always that an item was neutral. Some possible interpretations of DIF as an effect of multidimensionality were discussed, as were some hypotheses about why boys did better in mechanics and girls in electricity.