J Clin Epidemiol Vol. 51, No. 11, pp. 1013–1023, 1998
Copyright © 1998 Elsevier Science Inc. All rights reserved.

0895-4356/98 $–see front matter
PII S0895-4356(98)00093-6

The French SF-36 Health Survey: Translation, Cultural Adaptation
and Preliminary Psychometric Evaluation
Alain Leplège,1,* Emmanuel Ecosse,1 Angela Verdier,1 and Thomas V. Perneger2

1INSERM Unit 292, Hôpital de Bicêtre, Le Kremlin-Bicêtre Cedex, France, and 2Institut de Médecine Sociale et Préventive,
Université de Genève, Geneva, Switzerland
ABSTRACT. This article reports on the main developmental stages and on the preliminary psychometric
assessment of the final French version of the SF-36. A standard forward/backward translation procedure was
followed. When translating survey items, the emphasis was placed on conceptual equivalence. When translating
response choices, we attempted to select a set of response choices that replicate the U.S. version. The distance
between the response choices was checked using visual analogue scales (N = 30). The adaptation procedure also
included formal ratings of the difficulty of the translation, of the quality of the translation, and of the equivalence
between the American source version and the French target version. The face validity was checked during lay
panel sessions at which the translated questionnaire was administered to subjects from the general public,
hospital employees, and subjects with a low level of education. Standard psychometric techniques were used to
evaluate the cultural adaptation of the SF-36, using data from a general population survey. The main objective of
this analysis was to determine how well the scaling assumptions (summated rating or Likert-type scaling
construction) of the SF-36 were satisfied. The results support the claim that the scaling properties of the French
version of the SF-36 are adequate and that health outcomes may be reliably assessed using this version of the
instrument. J CLIN EPIDEMIOL 51;11:1013–1023, 1998. © 1998 Elsevier Science Inc.
KEY WORDS. Functional status, health status, health related quality of life, SF-36, IQOLA, France

In matters of health, the point of view of the patient is
gaining importance in decision-making procedures. Thus,
decision makers require indicators that will provide information about the way in which patients see their own
health. The SF-36 Health Survey was designed to meet this
need [1]. The adaptation into French of the SF-36 that is
presented here is part of the International Quality of Life
Assessment (IQOLA) project. The project aims to develop
a validated instrument for measurement reflecting the patient’s point of view and enabling international comparisons of health outcomes. The IQOLA project research
team laid out a standardized translation and adaptation procedure that has been followed by the French scientific
team. The main objective of IQOLA is to adapt the SF-36
into more than 15 languages for use in international studies
of health outcomes [2]. There are two main reasons behind
enterprises of this sort: the first is that adaptation provides
very cost-effective access to state-of-the-art quality of life
(QoL) measures in the target language or culture; the second is that if the adaptation is successful, such translated
instruments can be used in international comparative studies. These benefits are, however, conditioned by a certain
number of constraints specific to the cross-cultural procedure that must be respected.

*Address correspondence to: Alain Leplège, M.D., Ph.D., Mesure de la
Santé Perceptuelle et de la Qualité de Vie, INSERM U. 292, Hôpital de
Bicêtre, 82, rue du Général Leclerc, 94276 Le Kremlin-Bicêtre Cedex, France.
Accepted for publication on 7 July 1998.

The SF-36 had previously been translated into French by
several research teams. The Canadian version [3] could not
be used directly in France owing to linguistic differences in
Canadian French. The psychometric properties of another
pre-IQOLA French (France) version of the SF-36 have
been described by Bousquet et al. [4] and Bullinger et al. [5]
from results obtained from asthmatic patients and rhinitis
patients, respectively. A Swiss French version using a relatively simple translation procedure has been published by
Perneger et al. [6].
This article describes the successive stages in the translation and adaptation procedure used to develop the final
French version of the SF-36 within the IQOLA project and
provides a preliminary analysis of the psychometric properties of this final French version (version 1.3), which has
been administered to a sample drawn from a representative
panel of the French population. The factorial structure of
scores is also examined and compared with that of the American
version.

The SF-36
The SF-36 is a generic health status measurement instrument. It can be used to assess health status independent of
which diseases or illnesses affect the population under
study. It is made up of 36 questions, 35 of which are divided into 8 scales:
physical functioning (PF1 to PF10), role limitations relating to physical health (RP1 to RP4), bodily pain (BP1 and
BP2), general health perceptions (GH1 to GH5), vitality
(VT1 to VT4), social functioning (SF1 and SF2), role limitations relating to mental health (RE1 to RE3), and mental
health (MH1 to MH5); the remaining question is a single health transition item (HT). The
SF-36 is easy to administer and to score. The average completion time is 5 to 10 minutes. It can be self-administered or
administered by personal interview or by telephone. It has
been tested in numerous studies implemented on an international scale [7]. The responses of subjects are presented as
a profile of scores calculated for each scale [8].

Translation and Cultural Adaptation Methods
The translation and cultural adaptation of the SF-36 followed the IQOLA methodology described elsewhere in this
issue. Five initial forward translations were made independently by five translators who were experienced in health
status questionnaires but not familiar with the SF-36. English was the native language of one; the other four were
native speakers of the target language (French). These
translations, as well as the previous French and Canadian
versions, were circulated to the members of the translation
panel, who drew up independent proposals. Following this,
a seminar was organized. In the course of this meeting, all
the options were reviewed and the main translation choices
were made. Decisions concerning problematic issues were
made after tests on nonbilingual subjects. Throughout the
translation process, we treated separately the translation of the
items (e.g., the General Health Question-GH1: In general,
would you say your health is . . . ?) and of the response
choices, RC (e.g., excellent, very good, good, fair, poor).
TRANSLATION OF THE ITEMS. For the translation of the
questions, the objective was to ensure that the French version of the SF-36 referred to a concept of health comparable to that operationalized by the American version. The
task of adaptation consisted of work on the full understanding of the conceptual structure underlying the original
SF-36 to transpose it into French. The degree of freedom in
rendering the French version was guided and limited by a
certain number of factors.
First considered was the American instrument itself,
which was naturally the main referent, complemented by
the literature detailing the concepts, the conditions in
which the instrument was developed, and the origins of its
different items. For the wording of the French version, certain rules were applied. One of the difficulties encountered
was to arbitrate between these rules when they
conflicted. The linguistic register of the questionnaire
was equated to that of a person of 12 to 14 years of age,
which led to excluding, as far as possible, technical terms or
oversophisticated, pedantic, or formal words or phrases.
Next, the team aimed at producing as simple a version as
possible, made up of short sentences. The subject being
questioned is addressed personally in an informal, almost familiar, manner generally in the second person, as in the
American version. Care was taken to avoid introducing
ambiguities not present in the American source. Finally,
stylistic considerations were taken into account.
After an equivalent rendering of the original, the second
objective of the translation team was to achieve good acceptability (face validity) for the French version. Success in
this aim was checked by critical appraisal of the intermediate versions of the questionnaire by subjects from the general French population (face validity tests). These tests
made it possible to highlight ambiguities that might have
escaped the translators, to make sure that questions were
relevant, and to make choices between certain alternative
renderings. Indeed, cases can arise where there is no direct equivalent rendering in French for the American expression, either because none exists or because the possible
equivalent is not in the same linguistic register.
TRANSLATION OF THE RESPONSE CHOICES. For the
translation of the 9 different sets of response choices, the
team was given the task of producing as many response
choices as possible, with a view to subsequent empirical selection of the intermediate and extreme response choices to
comply with a certain number of criteria regarding ordinality, interval spacing, and comparability with the American
version in terms of range of health values. Once the pool of
response choices was established, several sets of intermediate and extreme response choices were selected for testing.
The number of sets tested in French varied according to the
dimensions explored. One or two sets of response choices
were tested in French for each set in the source version for
the dimensions HT, PF, SF, and BP. In the case of GH1
(overall rating of health from excellent to poor), 8 different
sets of extreme and intermediate response choices were
tested, as were 10 response choices without indicating the
extremes. In the case of VT and MH (all of the time to none
of the time), 6 sets of anchors/intermediate response choices
were tested, and in the case of GH2-5 (definitely true to definitely false), 3 sets were tested. For each tested set, subjects
were asked to indicate on a visual analogue scale the distance between each intermediate response choice and the
two extremes, one response choice at a time. The order in
which these exercises were performed was random, and two
booklets (one in reverse order of the other) containing all
the exercises were submitted, each to half of two convenience sample populations of 30 and 34 subjects. Analysis
of results was based on the number of cases that did not satisfy the criteria of ordinality (in terms of mean value) between response choices in a given set and the ordering of
the source version. For each response choice within a given
set, the difference between the average score observed and
the SF-36 score was calculated, and the sum of these differences (in absolute value) was calculated for each response
set. The response choices were then selected according to
the following sequential decision criterion: to minimize the
number of ordinal differences with the ordering of each
source version response set and to minimize the difference
with the actual scoring of the source version.
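
The sequential selection criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented by the authors: the function names, the 0 to 100 visual analogue scale, and all numeric values in the usage note are hypothetical. Each candidate set is summarized by its number of ordinal disagreements with the source ordering and by the sum of absolute differences from the source scores; candidates are then ranked on the first quantity, with the second used to break ties.

```python
from statistics import mean


def score_response_set(vas_ratings, source_scores):
    """Summarize one candidate set of response choices.

    vas_ratings: one list per subject, giving the VAS position of each
        response choice, listed in the order of the source version.
    source_scores: source-version value of each response choice,
        expressed on the same scale (hypothetical values in this sketch).
    Returns (ordinal_violations, sum_of_absolute_differences).
    """
    n_choices = len(source_scores)
    means = [mean(subject[i] for subject in vas_ratings) for i in range(n_choices)]
    # An adjacent pair is a violation when the observed means move in the
    # opposite direction to the source scores (ties count as violations).
    violations = sum(
        1
        for i in range(n_choices - 1)
        if (means[i + 1] - means[i]) * (source_scores[i + 1] - source_scores[i]) <= 0
    )
    abs_diff = sum(abs(m - s) for m, s in zip(means, source_scores))
    return violations, abs_diff


def select_response_set(candidates, source_scores):
    """Pick the candidate with the fewest violations, then the smallest deviation."""
    # Tuples compare lexicographically, which mirrors the sequential criterion.
    return min(candidates, key=lambda name: score_response_set(candidates[name], source_scores))
```

For example, with hypothetical source scores `[100, 75, 50, 25, 0]`, a candidate rated `[100, 60, 70, 30, 0]` by a single subject yields one ordinal violation and a summed absolute difference of 40, so it would rank below any candidate that preserves the source ordering.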
ASSESSMENT OF TRANSLATION QUALITY. The IQOLA procedure also provided for evaluation on a scale of 0 to 100 by
two independent “judges” of the difficulty and quality of
translations and also of the equivalence achieved between
the American and French versions. The results of this assessment were not extensively used in the translation seminar. The face validity and acceptability of the French version
were checked during lay panel sessions. After completion of
this procedure, test version 1.1 of the SF-36 was issued.
This version was then back-translated by two independent
translators, and the validity of the French version’s content
was checked by the American team (J. Ware and B. Gandek). The back-translation led to the modification of one
response choice in item GH1, which led to version 1.2
(from excellente, très bonne, bonne, moyenne, mauvaise to excellente, très bonne, bonne, médiocre, mauvaise). Subsequently, this version was revised in light of users’ comments
and psychometric analysis. The resulting French version
1.3 of the SF-36 was then used in the survey presented here.

Evaluation of the Psychometric Properties of the
Translated Version
Generally, the analysis plan provided by the IQOLA
project for validation of each national version was followed.
DESCRIPTIVE STATISTICS. The number and distribution of
missing data were examined to get an idea of the acceptability of the instrument. The distribution of responses for
each question was assessed visually. The mean and standard
deviation for responses to each question were calculated, as
well as the mean and empirical variance of scale scores.
The percentage of responses on anchor points (extremes)
was examined for each scale to detect floor or ceiling effects.
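
As an illustration of the floor and ceiling check, the sketch below computes the percentage of respondents at each extreme of a scale. It assumes that scale scores have already been transformed to a 0 to 100 range, as is usual for the SF-36; the function name and the sample scores are hypothetical.

```python
def floor_ceiling(scores, minimum=0, maximum=100):
    """Return (percent_at_floor, percent_at_ceiling) for a list of scale scores.

    minimum and maximum are the lowest and highest attainable scores;
    the 0 to 100 defaults are an assumption of this sketch.
    """
    n = len(scores)
    at_floor = sum(1 for s in scores if s == minimum)
    at_ceiling = sum(1 for s in scores if s == maximum)
    return 100 * at_floor / n, 100 * at_ceiling / n
```

With the hypothetical scores `[0, 50, 100, 100]`, a quarter of respondents are at the floor and half are at the ceiling.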


MULTITRAIT ANALYSIS. The SF-36 is composed of Likert-type scales, and scale scores are calculated by summing the
scores of each item within each dimension. The aim of this
series of analyses is to verify that the assumptions underlying
the calculation of the SF-36 scores are met so that the item
scores can be summed. Five assumptions were tested [8,9].
The first assumption is termed item internal consistency:
The responses to each question should be correlated in a
linear manner with the score of the scale to which it belongs (for instance, there should be a linear correlation of
the PF items with the PF scale). For item internal consistency to be achieved, the correlation (corrected for overlap) between the score for each item and the total score for
the scale to which it belongs should be moderate to
large (i.e., above or equal to 0.4) [10]. This property can
also be established by checking that there is a monotonic
and possibly linear relationship between the response
choice for an item and the scale score constructed from all
other items in that scale, in scales with more than two questions. If the monotonic relationship is not found, reweighting of responses should be envisioned [11].
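
The correlation "corrected for overlap" can be illustrated with a short sketch: each item is correlated with the sum of the other items in its scale, so that the item does not contribute to the total it is compared with. This is a generic illustration rather than the MAP-R implementation; the function names and the item scores used in the usage note are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(items):
    """Correlate each item with the sum of the other items in its scale.

    items: dict mapping an item name to its list of scores, one per
        respondent (the scale must contain at least two items).
    """
    result = {}
    for name, scores in items.items():
        others = [vals for key, vals in items.items() if key != name]
        # Total of the remaining items for each respondent.
        rest_total = [sum(row) for row in zip(*others)]
        result[name] = pearson(scores, rest_total)
    return result
```

Applied to a hypothetical three-item scale such as `{"VT1": [1, 2, 3, 4, 5], "VT2": [2, 2, 3, 5, 5], "VT3": [1, 3, 3, 4, 4]}`, every corrected correlation can then be compared with the 0.4 threshold. An item with no variance would make the correlation undefined and needs separate handling.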
The second assumption states that the score for each
question should show greater correlation with the score of
its hypothesized scale than with the other scales. This is
known as item discriminant validity. Because of the relatively small sample size in this analysis, the percentage of
scaling successes was calculated to show the number of
times items had higher correlations or significantly higher
correlations with their hypothesized scale than with other scales.
The third assumption implies that questions belonging to
the same scale and measuring the same concept should
show approximately the same variance. Should this not be
so, standardization or weighting of results should be considered. If each question contributes equally to the scale score,
the responses to each question can be equally weighted.
According to the fourth assumption, responses to a question should contain approximately the same amount of information about the concept being measured. In other words,
all the questions in a scale should show about the same correlation with the scale (without the question concerned).
The final assumption is that the scale scores obtained
should be reliable and interpretable. To check reliability,
the Cronbach alpha coefficient was calculated to estimate
the proportion of true score obtained in relation to error in
measurement. Coefficient alpha should exceed 0.7 for
group comparisons [12]. To assess the reliability of an instrument, it is also advisable to calculate the correlation coefficients between scores after test–retest over an interval of
2 to 3 weeks. Our data do not include this information.
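
For reference, Cronbach's alpha for a scale of k items is k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below is a generic implementation of that textbook formula, not the SAS or MAP-R code used in the study; the function name and data layout are assumptions.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of item score lists, one list per item, each holding
        one score per respondent (complete data assumed).
    """
    k = len(items)
    # Total scale score for each respondent.
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))
```

The same variance estimator must be used for the items and for the total; the population form is used here, and the sample form would give the same alpha because the correction factor cancels in the ratio.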
FACTOR ANALYSIS. Construct validity, that is, the process
in which validity is evaluated as the extent to which a measure correlates with variables in a manner consistent with
theory [14], is studied when the variable involved (health
status) cannot be observed directly and there is no standard
reference [15,16]. Factor analysis is one way to evaluate
construct validity. Factor analysis with varimax rotation
was carried out on correlations among scales to compare
the factorial structure of data with that obtained from the
American instrument. According to American studies, using responses obtained from the general population, it is
possible to group the scales according to their contribution
to two main factors, or second-order scales, mental and
physical [17]. The authors distinguish three situations:
when the average score from a scale is substantially correlated (i.e., more than 0.7) with the physical factor and only
slightly with the mental factor (less than 0.3), when it is
substantially correlated with the mental factor and only
slightly with the physical, and when the score is moderately
correlated with both factors (between 0.3 and 0.7). Thus,
scales PF, RP, and BP should be strongly correlated with the
physical factor; the scales MH, RE, and SF strongly correlated
with the mental factor; GH and VT correlated with both.
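
The three situations distinguished above can be expressed as a small classification rule. The sketch below uses the thresholds quoted in the text (0.7 and 0.3); the function name, the treatment of loadings that fit none of the three situations, and the loadings in the usage note are hypothetical.

```python
def classify_scale(physical_loading, mental_loading, strong=0.7, weak=0.3):
    """Classify a scale from its correlations with the two rotated factors."""
    p, m = abs(physical_loading), abs(mental_loading)
    if p > strong and m < weak:
        return "physical"
    if m > strong and p < weak:
        return "mental"
    if weak <= p <= strong and weak <= m <= strong:
        return "both"
    # Patterns outside the three situations described in the text.
    return "unclassified"
```

For instance, hypothetical loadings of 0.85 on the physical factor and 0.12 on the mental factor would be classified as "physical", whereas 0.75 and 0.45 would fall outside the three situations and be reported as "unclassified".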
SAMPLE. Psychometric validation data for version 1.3 of
the SF-36 were obtained from a study of a sample of 209 individuals from a representative panel of the French population [18], which was used to test the feasibility of conducting a norming survey in France. Three respondents
provided inconsistent responses to some questions and were
excluded from the analysis. The individuals were recruited
by mail during August and September 1994. There were an
equal number of men and women, with an average age of 45
(minimum 15, maximum 93). Fourteen percent had primary education status; 38%, secondary schooling; 48%,
higher education; and 53% were not working or employed.
ANALYTICAL SOFTWARE. Statistical analysis was carried out
using SAS and the Multitrait Analysis Program [19,20] Revised (MAP-R [21]).

Translation and Cultural Adaptation of the SF-36
into French
The main translation problems encountered in the course
of the development of the French version are described
next. Appendices 1 and 2 provide a comparison of the content of the American and French forms.
It was intended that the overall presentation of the questionnaire not include any sort of ambiguity. Thus the translation of your health (introductory sentence) gave rise to
some discussion. In French, in the perspective of an instrument measuring perceived health, it was possible to opt either for votre santé (your health) or votre état de santé (your
state of health). The translation team believed there was a
distinction to be made between the two. In French, santé is
feminine, whereas état de santé is masculine, which consequently affects the agreement of adjectives
used for RC and also possibly their impact. It was considered that there is a difference in connotation between
santé and état de santé, santé being something more personal
and intimate than état de santé, which was considered to be
more technical and conceptual. On account of the generally abstract character of the assessment required of the respondents, questions refer either to their state of health
(état de santé) (RP1-4, SF1-2) or to one of its main components, their physical state (état physique) (RP1-4, SF2) or to
their emotional state (état émotionnel) (RE1-3, SF2). In the
case of items GH1, GH5, which require more intuitive
overall assessments, subjects are questioned on their health (santé).
Because the SF-36 aims to capture and quantify subjective impressions of patients, the respondents are reminded
that it is they who are being questioned, as, for example, in
the phrase would you say rendered as pensez-vous in item
GH1. This objective was maintained in the translation
through expressions such as ressentez-vous (introduction),
trouvez-vous (HT), vous sentir (introduction to RE), or vous
vous êtes senti(e) (item 9-VT1-4 and MH1-5) (where the
pronominal [reflexive] verb, which requires the option
of feminine agreement for female respondents, reinforces the
personal nature of the questions).
The reference context of a perceived health indicator is
not the medical framework (it is not a clinical indicator),
but the normal life setting of the individual being questioned. Several expressions are used in the American version to remind respondents of this (usual activities, typical
day, regular daily activities, etc.). The French version refers
to la vie de tous les jours, i.e., everyday life, excluding extraordinary or unusual days. This expression follows the
American while being simpler. It was preferred in the introductory sentence to a word-for-word translation of usual activities, as the word activités, although seemingly equivalent,
would not be used in such a context in French. In PF this
same expression is also used to translate a typical day, as the
apparently closer rendering une journée typique was believed
to be unlikely to be readily understood in French.
Initially, measures of health status mainly concerned
function (psychological, social, and physical) of subjects.
This is still very present in the American questionnaire, in
which numerous items explore the way in which subjects
carry out their usual activities and fulfill their normal role.
This approach has been maintained in the French version.
The word activity was translated to activité in PF, RP, and
BP. In RP, the verb faire was used in conjunction with two
direct objects travail (work) and toute autre activité (any other
activity). This was believed to convey the idea of any objective, concrete, material activity, outside the respondent’s
self. In SF, your social activities was rendered by the expression dans votre vie et vos relations, an expression that recalls
la vie de tous les jours; in PF, activités sociales would be narrower and refer to something more formal. The English
word normal, used to define the activity in SF, would connote the idea of norme (instead of the idea of usual activity)
if translated as such by normal in French. It was therefore
There are several examples of such adaptation in the SF-36, such as the transposition of walking one block to a notion
of approximate distance, as the French urban layout renders
the American expression meaningless, or the addition of
douche (shower) to item 3j to avoid nonresponse by subjects
in France where showers are more common than baths.
Other changes were made to maintain the desired linguistic
register. For instance the word phrase (sentence) was preferred to the word affirmation (statement) to translate statement in GH 2-5 because the latter term in French would
probably not be understood by all respondents. One instance of another sort of modification can be seen in the
rendering of compared to one year ago (HT), which was
translated by par rapport à l’année dernière à la même époque (in
comparison with last year at the same time/period). This translation was preferred to the stricter rendering par rapport à il
y a un an, which, although shorter, is not so immediately
clear (the accumulation of short words in French, unlike
English, does not necessarily make for clarity), as the English version does not refer back to an exact date in the calendar. Another example of stylistic consideration taken
into account would be the avoidance in French of the conditional in the rendering of would you say, translated by
pensez-vous (do you think), because the conditional diriez-vous in French is rather too formal. On the other hand, the
redundancies characteristic of English were on the whole
maintained in French, especially in cases in which the
translation posed a problem (for instance, downhearted and
blue translated by triste et abattu[e]).
Finally, the translation of the SF-36 imposed constraints
that were sometimes difficult to reconcile with a “good” rendering in terms of French style. For example, the RC in question 9 (VT and MH) are relatively complex because the
verbs required are pronominal and because the feminine
agreement option was also necessary (a female respondent in
French could feel excluded if this were not provided, which
could influence the way in which she responds).
Another example of a difficulty typically arising in the
translation of English questionnaires into French is the rendering of expressions such as how well, how much, and how
true or false, which have no neat equivalent in French. For
example, in BP1, how much bodily pain becomes quelle a été
l’intensité de vos douleurs physiques, as the quantitative combien (how many) with a plural (believed to be necessary in
French) suggests number or frequency rather than degree or
intensity. In GH 2-5, a similar problem with translating
how true or false was met by a fairly colloquial French expression dans quelle mesure (to what degree).
Transition from Version 1.1 to 1.3
Version 1.1 had been put to immediate use in several phase
IV clinical studies in populations suffering from angina or arthritis
(the results of these studies have been presented elsewhere
[22,23]). Although analysis of results from these studies was
very encouraging, one issue remained: The factor analysis
did not group the dimensions of the French version in the
same manner as in the source version. In the first population (suffering from angina), dimensions with a correlation
coefficient above 0.7 were for axis I: MH, VT, and GH, and
for axis II: RP and RE. In the other group (suffering from arthritis), the dimensions with correlation with axis I above
0.7 were RP, RE, and with axis II, PF, VT, and GH. It is the
absence of distinction between RE and RP that posed the
biggest problem. The significance of these results was discussed with the American team, with the help of the back-translations. It appeared that the translation of limited by
gêner (bothered) did not conform to the intentions of the
American authors, which led to the decision to replace
gêner as a rendering of limited or interfered by limité in questions PF, RP, and BP, to emphasize the “physical” nature of
such a limitation or interference. The back-translations led
also to alteration of response choices for question GH1. Initially, the response choices chosen were those that achieved
the most equal spacing possible on empirical tests (excellente, très bonne, bonne, moyenne, mauvaise). But application of this rule did not minimize deviation between the
French version and the MOS SF-36 reweighted intervals.
The problem probably arose from the fact that a word
(here, moyenne) is not subjectively perceived in the same
manner from one person to another: We have observed
that moyen is considered by some people to refer to something rather less than desirable, by others to something acceptable or even a little on the better side. The word médiocre was selected (excellente, très bonne, bonne, médiocre,
mauvaise) as being more accurately positioned between
bonne and mauvaise, which led to version 1.2 of the instrument. A number of other formal or stylistic alterations suggested by users or attentive readers were made, finally producing version 1.3, which is discussed here.
Descriptive Statistics of Items
In the sample of 206 respondents, the amount of missing
data was very small, as only 0.5% of all questions received no response, which indicates that the questionnaire has good acceptability (see Table 1). These missing data were spread evenly over the different scales. On
account of the treatment recommended for missing data in
MAP-R (the patient is not taken into account in the analysis if in any one dimension or scale more than half the responses are missing), only one subject was excluded from the
analysis. Overall, all the RCs were used for all questions.
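
The exclusion rule described above (a respondent is dropped when more than half of the responses are missing in any one scale) can be sketched as follows. This is an illustration of the rule as stated in the text, not MAP-R code; the function name, the use of `None` for a missing response, and the example data are assumptions.

```python
def is_excluded(responses, scales):
    """Return True if the respondent should be left out of the analysis.

    responses: dict mapping item name to a response, with None (or an
        absent key) standing for a missing answer.
    scales: dict mapping scale name to the list of its item names.
    """
    for item_names in scales.values():
        missing = sum(1 for item in item_names if responses.get(item) is None)
        # Exclude when strictly more than half of the scale's items are missing.
        if missing > len(item_names) / 2:
            return True
    return False
```

Under this rule, a respondent who skips one of the two bodily pain items is kept, because exactly half is not more than half, whereas one who skips both is excluded.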
The difference between maximum and minimum standard deviation for responses to questions in a given scale is
given in Table 2. This difference is greater for GH and PF
(range of deviation >0.35). It is 0.2 for VT and 0.3 for BP.
It is relatively low for the others (<0.10).



TABLE 1. Description of eight scales of the MOS 36-item Short Form Health Survey (French version 1.3) submitted to 209

Rows reported for each of the eight scales: number of items; percentage nonrespondents; mean score; standard deviation; percentage at ceiling; percentage at floor.
[Table values not preserved in this copy.]

ITEM INTERNAL CONSISTENCY. All correlations between a
question in a scale and the scores for other questions in the
same scale were above 0.4. Furthermore, within a given
scale, these correlations are relatively close (Table 2). The
correlation coefficient range for each scale varies, from 0.29
for PF to 0.08 for RP. This result is consistent with the
number of questions per scale and their heterogeneity. For
instance, the scale for PF includes 10 questions, which span
a wide spectrum of physical activities, whereas VT and GH
are shorter and more homogeneous scales. In addition, we
assessed the ordinality of response choices visually for all
scales with more than two questions. These hypotheses
were generally met (not shown).
ITEM DISCRIMINANT VALIDITY. The scores for each question
are generally significantly more closely correlated with
their scale than with others. There is one exception: VT3 is
more closely correlated with the dimensions SF and MH
than with its own dimension, VT. These results indicate
that the questions in the SF-36 possess good discriminant
validity (see Table 2).
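
The percentage of scaling successes can be illustrated with a simple count: for every item, its corrected correlation with its own scale is compared with its correlation with each other scale, and a success is recorded when the former is higher. The sketch below implements only this plain comparison, not the significance test mentioned in the text; the function name and all correlation values in the usage note are hypothetical.

```python
def scaling_success(own_r, other_r):
    """Count item-level scaling successes for one scale.

    own_r: dict mapping item name to its corrected correlation with its
        own scale.
    other_r: dict mapping item name to a dict of correlations with each
        of the other scales.
    Returns (successes, comparisons, percent_success).
    """
    successes = 0
    comparisons = 0
    for item, r_own in own_r.items():
        for r_other in other_r[item].values():
            comparisons += 1
            if r_own > r_other:
                successes += 1
    return successes, comparisons, 100 * successes / comparisons
```

As a hypothetical example, four items compared against two other scales give eight comparisons; if one item correlates more strongly with both of the other scales than with its own, the result is 6 successes out of 8, or 75%.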




Analysis of the range of standard deviation for each scale
brings out two groups: PF and GH, for which the difference
between the extreme standard deviations is quite large
(generally above 0.4), and RP, BP, VT, SF, RE, and MH,
for which the difference between the extreme standard deviations is relatively small (generally below 0.2) (see Table
2). The PF scale comprises 10 questions, some of which are
easy (e.g., bathing and dressing) and some of which are
hard (e.g., vigorous activity); it is therefore not surprising
that the range is greater in a general population. Because
GH and VT are by definition the most global concepts, the
standard deviation for each is high. These results are again
in line with the assumptions that responses to questions belonging to a homogeneous dimension contain the same
amount of information about the concept being measured.
SCALE INTERNAL CONSISTENCY RELIABILITY. The Cronbach alpha coefficient, which estimates the proportion of total variance attributable to true differences in scores, is always above
0.85 (see Table 3). This indicates good reliability and validates
the use of this instrument for between-group comparisons.

TABLE 2. Scaling properties of the MOS 36-item Short Form Health Survey (French version 1.3)

Scale                   Convergent validity (a)    Discriminant validity (b)
Physical functioning    10/10 (100%)               80/80 (100%)
Role physical           4/4 (100%)                 32/32 (100%)
Bodily pain             2/2 (100%)                 16/16 (100%)
General health          5/5 (100%)                 40/40 (100%)
Vitality                4/4 (100%)                 30/32 (93.8%)
Social functioning      2/2 (100%)                 16/16 (100%)
Role emotional          3/3 (100%)                 24/24 (100%)
Mental health           5/5 (100%)                 40/40 (100%)

(a) Item correlations with own scale.
(b) Item correlations with own scale significantly (P < 0.05) greater or greater than with other scale.
[The columns for item standard deviation, correlations of item with own scale (corrected for overlap), and correlations of items with other scales are not preserved in this copy.]


TABLE 3. Reliability and interscale correlations of the MOS 36-item Short Health Survey (French version 1.3)

Correlation between scales (not corrected for attenuation)

Physical functioning
Role physical
Bodily pain
General health
Social functioning
Role emotional
Mental health

















sample enjoyed
relatively good health, and for 26 of the questions in the
SF-36, the response reflecting the best state of health was
most often chosen. Substantial ceiling effects were seen for
five scales (Table 1). These results reflect the good health
status of the surveyed population.
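Ceiling and floor effects of this kind are usually expressed as the percentage of respondents at the best or worst possible score. The sketch below assumes scale scores transformed to the 0 to 100 range (standard SF-36 scoring, with 100 as the best state of health); the function name is hypothetical.

```python
import numpy as np

def floor_ceiling_pct(scores, lowest=0.0, highest=100.0):
    """Percentage of non-missing scores at the lowest and highest possible value."""
    s = np.asarray(scores, dtype=float)
    s = s[~np.isnan(s)]                      # drop missing scale scores
    return 100.0 * np.mean(s == lowest), 100.0 * np.mean(s == highest)

# Toy example: four respondents, one missing score.
print(floor_ceiling_pct([100, 100, 50, 0, np.nan]))
```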

Factor Analysis
The factor analysis did not reproduce exactly the structure
seen for the original instrument (Table 4). It does support
the existence of two basic dimensions of health (physical
and mental), but the correlation between these factors is
higher in France than in the United States [17]. However,
factor analysis results can be influenced by many conditions
other than the relationship between underlying concepts, including the skewness of scores. As some of our scores were very
skewed (e.g., PF, RP and RE), this could affect the results.

TABLE 4. Hypothesized associations between scales of the MOS 36-item Short Health Survey
(French version 1.3) and postulated physical and mental components of health, compared with
results of a factorial analysis

Factorial analysis: rotated principal components
Correlation with

Physical functioning
Role physical
Bodily pain
General health
Social functioning
Role emotional
Mental health

a Same as in McHorney et al. [17]: + strong association (r > 0.7); * moderate association (0.3 < r < 0.7); − weak
association (r < 0.3).
The percentage of measured variance explained by these two factors is 75%.

The most apparent “alterations” to the American version
are in fact an integral part of the process of translation into
another language and culture. For instance, it is extremely
common for a word in English to have an equivalent in
French in terms of form (being from the same root) but for
the two words to cover different areas of meaning, following
divergent evolution from the historical root. These areas of
meaning frequently overlap to some extent, which can lead
to the wrong assumption that they are synonymous.
In the translation of the SF-36, various types of transposition strategies can be distinguished. The first and most obvious is cultural adaptation of the “superficial” content to
make the underlying concept being measured accessible to
the other culture. Syntactic changes, likewise, are sometimes desirable or necessary. They are generally purely routine in translation, but in a questionnaire they can be important because they may affect the way in which questions
are asked or the way in which they are hinged on the RCs.
Sometimes, the same source word will be rendered in different ways in French, or, conversely, different source expressions may be translated by a single word or expression
(compare, e.g., the translation of activity, limited, and interfere). This sort of strategy reflects the fact that words are
polysemous, but not in an equivalent way from one language to another, even when linguistic roots are shared.
Context also has a powerful effect on the impact of a word,
which, again, is not the same from one language to the other.
Finally, it should be noted that certain aspects of the
source version have been strictly maintained in the French:
The format is extremely close to the American version, as it
was not thought to be inappropriate to French respondents
in any way. The RCs respect the mode used in the American source (degree, intensity, frequency, etc.); they also
comply with the structure of the American in that sets of
adjectives, comparatives, substantives or substantivated
verbs, and so on are used as in the source. Certain redundancies were also maintained despite being less characteristic of French, because they were believed to contribute to
clarity and to enable the concept to be better defined.
The sort of adjustment described in the Results sections,
which can also be seen as a compromise in some instances,
is no doubt inevitable, but it could well explain why a translated questionnaire does not necessarily obtain quite as satisfactory psychometric results as does the source version.
Psychometric Properties of Version 1.3
Version 1.3 of the French SF-36 has good psychometric
properties. Indeed, the multitrait analysis of our data
suggests that (1) virtually all the questions of version 1.3 of
the SF-36 are correlated in a linear manner to the concept
being measured, (2) these questions are better correlated
with their own scale than with the other scales, (3) questions referring to the same concept have approximately the
same variance, (4) within a given scale the questions contain about the same amount of information on the concept
measured, and (5) the scores obtained are reproducible (reliability). Thus, it can be said that the main assumptions
underlying the SF-36 score design are preserved when compared with the original American form [24] and that it
should, therefore, be possible to use the instrument to measure health-related quality of life in France. This is, however, only a preliminary analysis of the psychometric properties of the French version of the SF-36. Certain essential
psychometric characteristics of this version of the SF-36
still need to be described and require further data collection. A reproducibility study to calculate the test–retest
correlation coefficient should be envisaged. Likewise, the
capacity of the instrument to discriminate between groups
of patients should be studied, as should score sensitivity to
clinical changes by way of longitudinal studies.


We acknowledge the pioneering work of Denis Bucquet, who initially
headed the French team of the International Quality of Life Assessment
(IQOLA) project. This endeavor would not have been possible without
the continuing help and advice of Barbara Gandek and John Ware from
the New England Medical Center. We also acknowledge the useful
comments of the following individuals: Ann Smet (Glaxo Belgique),
Jean Pierre Dreyfus (Sofres Medicale), Patrick Marquis and Catherine
Acquadro (Mapi, études domaine médical), Emmanuel Picavet (Université de Paris I), Anonymous (Medtronic), and, last but not least,
the anonymous reviewers of this article who provided extremely useful
comments and suggestions. The initial translation group of the SF-36
was composed of Denis Bucquet, Alain Leplège, Marga Berr, Angéla
Verdier, Bruno Cadet, and Stephanie Condon. The back-translations
were made by Kathy Bean and Louise Burchill.

References
1. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36): I. Conceptual framework and
item selection. Med Care 1992; 30: 473–483.
2. Aaronson NK, Acquadro C, Alonso J, et al. International
quality of life assessment (IQOLA) project. Qual Life Res
1992; 1: 349–351.
3. Wood-Dauphinee S, Gauthier L, Gandek B, Mangan L, Pierre
U. Readying a US measure of health status, the SF-36, for use
in Canada. Clin Invest Med 1997; 20: 224–238.
4. Bousquet J, Knani J, Dhivert H, Richard A, Chicoye A, Ware
JE, et al. Quality of life in asthma. I. Internal consistency and
validity of the SF-36 questionnaire. Am J Respir Crit Care
Med 1994; 149: 371–375.
5. Bullinger M, Bousquet J, Marquis P, Fayol C, Valentin B, Burtin B.
Quality of life assessment in rhinitis: Results of the SF-36
Health Survey in a clinical trial. Qual Life Res 1994; 3:
6. Perneger TV, Leplège A, Etter JF, Rougemont A. Validation
of a French language version of the MOS 36-item Short Form
Health Survey (SF-36) in young healthy adults. J Clin Epidemiol 1995; 48: 1051–1060.
7. Ware JE. SF-36 Health Survey. Manual and Interpretation
Guide. Boston, MA: The Health Institute; 1993.
8. Medical Outcomes Trust. How to Score the SF-36 Short-Form
Health Survey. Boston, MA: The Medical Outcomes
Trust; 1992.
9. Likert RA. A technique for the measurement of attitudes.
Arch Psychol 1932; 140: 5.
10. Stewart AL, Ware JE. Measuring Functioning and Well-Being:
The Medical Outcomes Study Approach. Durham,
NC: Duke University Press; 1992.
11. Bernstein IH. Applied Multivariate Analysis. New York:
Springer-Verlag; 1988.
12. Nunnally JC. Psychometric Theory. New York: McGraw-Hill; 1978.
13. Moret L, Mesbah M, Chwalow J, Lellouch J. Validation
interne d’une échelle de mesure: Relation entre ACP, coefficient
alpha de Cronbach et coefficient de corrélation intraclasse.
Rev Epidemiol Sante Publique 1993; 41: 179–186.
14. Ware JE, Snow KK, Kosinski M, Gandek B. SF-36 Health
Survey Manual and Interpretation Guide. Boston, MA: The
Health Institute, New England Medical Center; 1993.
15. Streiner DL, Norman GR. Health Measurement Scales: A
Practical Guide to their Development and Use. Oxford:
Oxford University Press; 1989.
16. Kaplan R, Berry CC. Health status: Types of validity and the
index of well-being. Health Serv Res 1976; 478–507.


17. McHorney CA, Ware JE, Raczek AE. The MOS 36-item
Short-Form Health Survey (SF-36): II. Psychometric and
clinical tests of validity in measuring physical and mental
health constructs. Med Care 1993; 31: 247–263.
18. Sofres Metascope panel.
19. Hays RD, Hayashi T, Carson S, Ware JE. User’s Guide for
the Multitrait Analysis Program (MAP) Version 2. Santa
Monica, CA: Rand Corporation; 1988.
20. Hays RD, Hayashi T. Beyond internal consistency: Rationale
and user’s guide for Multitrait Analysis Program on the microcomputer. Behav Res Methods, Instrument Computers
1990; 22: 167.
21. Ware JE, Harris WJ, Gandek B, Rogers BW, Reese PR.
MAP-R for Windows: Multitrait/Multi-Item Analysis Program—Revised. Boston, MA: Health Assessment Lab; 1997.


22. Leplège A, Bucquet D. Translation, linguistic validation, and
preliminary psychometric assessment of the SF-36 in French
within the IQOLA Project. Qual Life Res 1994; 3: 59.
23. Leplège A, Mesbah M, Marquis P. Analyse préliminaire des
propriétés psychométriques de la version Française d’un questionnaire international de mesure de qualité de la vie: le MOS
SF-36 (version 1.1) [Preliminary psychometric analysis of the
French version of an international quality of life questionnaire: the MOS SF-36 (version 1.1)] Rev Epidemiol Sante
Publique 1995; 43: 371–379.
24. McHorney CA, Ware JE, Lu JFR, Sherbourne CD. The MOS
36-Item Short-Form Health Survey (SF-36): III. Tests of data
quality, scaling assumptions and reliability across diverse
patient groups. Med Care 1994; 32: 40–66.



APPENDIX 1. Source items and French translation


Source item
    French translation

Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports
    Efforts physiques importants tels que courir, soulever un objet lourd, faire du sport, etc.
Moderate activities, such as moving a table, pushing a vacuum cleaner, bowling or playing golf
    Efforts physiques modérés tels que déplacer une table, passer l’aspirateur, jouer aux boules, etc.
Lifting or carrying groceries
    Soulever et porter les courses
Climbing several flights of stairs
    Monter plusieurs étages par l’escalier
Climbing one flight of stairs
    Monter un étage par l’escalier
Bending, kneeling, or stooping
    Se pencher en avant, se mettre à genoux, s’accroupir
Walking more than a mile
    Marcher plus d’un kilomètre à pied
Walking several blocks
    Marcher plusieurs centaines de mètres
Walking one block
    Marcher une centaine de mètres
Bathing or dressing yourself
    Prendre un bain, une douche ou s’habiller
Cut down the amount of time you spent on work or other
    Avez-vous réduit le temps passé à votre travail ou à vos activités habituelles?
Accomplished less than you would like
    Avez-vous accompli moins de choses que ce que vous auriez
Were limited in the kind of work or other activities
    Avez-vous dû arrêter de faire certaines choses?
Had difficulty performing the work or other activities (for example, it took extra effort)
    Avez-vous eu des difficultés à faire votre travail ou toute autre activité? (Par exemple, cela vous a demandé un effort
How much bodily pain have you had during the past 4
    Au cours de ces 4 dernières semaines, quelle a été l’intensité de vos douleurs physiques?
During the past 4 weeks, how much did pain interfere with your normal work (including both work outside the home and housework)?
    Au cours de ces 4 dernières semaines, dans quelle mesure vos douleurs physiques vous ont-elles limité(e) dans votre travail ou vos activités domestiques?
During the past 4 weeks, to what extent has your physical health or emotional problems interfered with your normal activities with family, friends, neighbors, or groups?
    Au cours de ces 4 dernières semaines, dans quelle mesure est-ce que votre état de santé, physique ou émotionnelle, vous a gêné dans votre vie et vos relations avec les autres, votre famille, vos amis, vos connaissances?
During the past 4 weeks, how much of the time has your physical health or emotional problems interfered with your social activities (like visiting friends, relatives, . . .)
    Au cours de ces 4 dernières semaines, y a-t-il eu des moments où votre état de santé, physique ou émotionnelle, vous a gêné dans votre vie ou vos relations avec les autres, votre famille, vos amis, vos connaissances?
Have you been a very nervous person?
    Vous vous êtes senti(e) très nerveux(se)?
Have you felt so down in the dumps that nothing could cheer you up?
    Vous vous êtes senti(e) si découragé(e) que rien ne pouvait vous remonter le moral?
Have you felt calm and peaceful?
    Vous vous êtes senti(e) calme et détendu(e)?
Have you felt downhearted and blue?
    Vous vous êtes senti(e) triste et abattu(e)?
Have you been a happy person?
    Vous vous êtes senti(e) bien dans votre peau?
Cut down the amount of time you spent on work or other
    Avez-vous réduit le temps passé à votre travail ou à vos activités habituelles
Accomplished less than you would like
    Avez-vous fait moins de choses que ce que vous auriez
Didn’t do work or other activities as carefully as usual
    Avez-vous eu des difficultés à faire ce que vous aviez à faire avec autant de soin et d’attention
Did you feel full of pep?
    Vous vous êtes senti(e) dynamique?
Did you have a lot of energy?
    Vous vous êtes senti(e) débordant(e) d’énergie?
Did you feel worn out?
    Vous vous êtes senti(e) épuisé(e)?
Did you feel tired?
    Vous vous êtes senti(e) fatigué(e)?
In general, would you say your health is:
    Dans l’ensemble, pensez-vous que votre santé est:
I seem to get sick a little easier than other people
    Je tombe malade plus facilement que les autres
I am as healthy as anybody I know
    Je me porte aussi bien que n’importe qui
I expect my health to get worse
    Je m’attends à ce que ma santé se dégrade
My health is excellent
    Je suis en parfaite santé
Compared to one year ago, how would you rate your health in general now?
    Par rapport à l’année dernière à la même époque, comment trouvez-vous votre état de santé en ce moment?

Copyright © 1993 Health Assessment Lab. All rights reserved.



APPENDIX 2. Source response choices and French translation




PF1, PF2, PF3, PF4,
PF5, PF6, PF7,
PF8, PF9, PF10

Yes, limited a lot
Yes, limited a little
No, not limited at all

Oui, beaucoup limité(e)
Oui, un peu limité(e)
Non, pas du tout limité(e)

RP1, RP2, RP3, RP4,
RE1, RE2, RE3




Very mild
Very severe

Très faible
Très grande

BP2, SF1

Not at all
A little bit
Quite a bit

Pas du tout
Un petit peu


All of the time
Most of the time
Some of the time
A little of the time
None of the time

En permanence
Une bonne partie du temps
De temps en temps

MH1, MH2, MH3,
MH4, MH5, VT1,
VT2, VT3, VT4

All of the time
Most of the time
A good bit of the time
Some of the time
A little of the time
None of the time

En permanence
Très souvent


Very good

Très bonne

GH2, GH3, GH4,

Definitely true
Mostly true
Not sure
Mostly false
Definitely false

Totalement vraie
Plutôt vraie
Je ne sais pas
Plutôt fausse
Totalement fausse


Much better now than one year ago
Somewhat better now than one year ago
About the same
Somewhat worse now than one year ago
Much worse now than one year ago

Bien meilleur que l’an dernier
Plutôt meilleur
A peu près pareil
Plutôt moins bon
Beaucoup moins bon

Copyright © 1993 Health Assessment Lab. All rights reserved.
