Project in Applied Statistics and Econometrics:
What is the impact of immigration on
wage level in France?
BEN FARES Mouad
NOLAY Matthieu
2015
M1 Aix Marseille School of Economics
Mr. SEVESTRE Patrick
I) Introduction
Immigration is a very old phenomenon. Thousands of years ago, people were already moving from one land to another to flee persecution, to search for a better place to live, or to discover new lands. Today, immigration is more newsworthy than ever, since it is facilitated by great progress in transportation, by globalization, and by the opening of borders. But the reasons have not really changed: immigration is often guided by the hope of a better life, or by the need to escape war, persecution (religious, ethnic, etc.), and numerous other hardships.
In France, immigration increased after the Revolution, when France became a host country as well as a land of liberty and rights for all. But it was in the last century that immigration accelerated. After the First World War, France was devastated and had lost a great part of its workforce. To rebuild the country quickly, mass immigration was organized. The same mechanism was used after the Second World War.
During the thirty years of post-war economic growth, immigration was not seen as a problem: it was the beginning of the consumer society, and industries were thriving. More recently, however, the economy has known more recessions than durable growth, and immigration is less and less welcome in most developed countries.
Immigration policies and issues are often at the heart of public debate. Governments, candidates, and voters take them very seriously, and opinions diverge. Some claim that immigration has positive effects on the country's economic situation and growth. Others assert that immigrants worsen the country's labor market by lowering wages and increasing unemployment for native workers. The rest think that immigration has a small, perhaps insignificant, impact on the labor market, or have no opinion on the matter.
Many recent studies have tried to unravel this complex phenomenon. But here too, the conclusions are not always similar.
In principle, if one considers only the law of supply and demand, an arrival of immigrants on the labor market should increase the supply of labor and lower its price, which in this case is the wage level. At least, this is true when comparable segments of the population are observed.
Figure 1: The law of supply and demand.
Panel I: impact on price of an increase in demand (the price goes up).
Panel II: impact on price of an increase in supply (the price goes down). This is the panel that represents immigration theoretically under the law of supply and demand.
Indeed, an important point when studying the impact of immigration on the wage level is that the effect can be measured only at equivalent levels of skill, experience, education, sector, and profession.
Empirically, studies carried out before the 21st century had difficulty establishing the sizable negative effect on wages implied by the law of supply and demand. For example, Friedberg and Hunt (1995), Smith and Edmonston (1997), Borjas (1994), and Lalonde and Topel (1996) all concluded that the impact of immigration on the wages of native workers is small. These studies exploited the geographical clustering of immigrants, treating different areas as separate local labor markets in order to identify the effect on wages. As Carrasco (2008) put it: “Most of the previous studies suggest that, at most, a 10% increase in the fraction of immigrants reduces the wages of native workers by about 1%.”
Borjas (2003) criticizes this approach. He explains that the problem with these studies is that they ignore “the strong economics currents that tend to equalize economic conditions across cities and regions”. He proposes a new approach based on the Human Capital Theory, arguing that “by paying closer attention to the characteristics that define a skill group one can make substantial progress in determining whether immigration influences the employment and earnings opportunities of native workers”. His approach focuses on correlations across skill groups (using education and labor market experience as indicators of skill). With it, Borjas (2003) found that a 10% increase in the size of a skill group lowers the wage of that group by 2 to 3%.
Nevertheless, during the last decade, a large number of immigrants have arrived in Europe. As a result, many European countries have received immigrants from all over the world (especially from North Africa in the case of France and Spain). The need for studies analyzing the impact of immigration on the wages of native workers in European countries has therefore increased notably.
Our empirical analysis uses data from the annual “Enquête Emploi” conducted in 2012 by INSEE (Institut National de la Statistique et des Études Économiques).
II) Data
In this study we use the “Enquête Emploi 2012” database (labor force survey 2012) from INSEE. This panel survey provides data on profession, activity by sex, age and nationality, working hours, type of employment and contract, and wages. The survey was carried out over the full year 2012 by interviewing the individuals concerned several times. The only restriction is that respondents must be more than 15 years old.
The labor force survey contains over 560 variables and about 422,000 observations. We selected only 46 variables to conduct our study (details about the chosen variables are given in the appendices).
The dependent variable
To measure the impact of immigrants on the wage level of native workers, our dependent variable is naturally the wage of native workers. To keep only native workers' wages, we discarded all immigrants' wages from our database.
We consider the hourly wage to be the most meaningful measure for this purpose. We also take the logarithm of the dependent variable, because we want to show how the variables selected in the model explain the wage level and, more specifically, whether the immigration variable has a strong impact. The log makes statements about the impact of the explanatory variables easier, since coefficients can be read as approximate proportional changes in the wage.
We define the variable “hourly_wage” and create “logW” as follows:
\[ \text{hourly\_wage} = \frac{\text{salmee}}{\text{nbheur}} \]
\[ \text{logW} = \log(\text{hourly\_wage}) \]
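As a minimal sketch of this transformation, assuming `salmee` is the declared monthly wage in euros and `nbheur` the corresponding number of hours. The function name `log_hourly_wage` and the sample values are hypothetical and are not taken from the survey:

```python
import math

def log_hourly_wage(salmee, nbheur):
    """Return log(salmee / nbheur), or None when the wage cannot be computed."""
    # Missing or non-positive inputs give no usable hourly wage.
    if salmee is None or nbheur is None or salmee <= 0 or nbheur <= 0:
        return None
    hourly_wage = salmee / nbheur
    return math.log(hourly_wage)

# Hypothetical worker: 1,500 euros for 150 hours, i.e. 10 euros per hour.
example = log_hourly_wage(1500.0, 150.0)
print(round(example, 4))
```

Returning `None` for unusable inputs mirrors the way observations with missing wages are dropped before estimation.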
The explanatory variables
Education level:
The education level is an important explanatory variable of an individual's wage. The Human Capital Theory emphasizes that education and experience are the two main determinants of wages. Clearly, different levels of education lead to different wages at the beginning of an individual's career.
In the “Enquête Emploi 2012”, the education level is defined in 11 categories:
- 71: Without diploma
- 70: “Certificat d'études primaires”
- 60: “Brevet des collèges”
- 50: CAP/BEP or equivalent
- 41: General Baccalaureate
- 42: Technological or vocational Baccalaureate or equivalent
- 33: Paramedical and social (Bac+2)
- 31: BTS/DUT or equivalent
- 30: DEUG
- 11: Schools at bachelor level and above
- 10: Bachelor and above (up to PhD)
We built 11 dummy variables, one for each of the above categories, named as follows: Dip_71 (without diploma), Dip_70 (Certificat d'études primaires), Dip_60 (Brevet des collèges), Dip_50 (CAP/BEP or equivalent), Dip_41 (General Baccalaureate), Dip_42 (Technological or vocational Baccalaureate), Dip_33 (Paramedical and social), Dip_31 (BTS/DUT or equivalent), Dip_30 (DEUG), Dip_11 (Schools at bachelor level and above) and Dip_10 (Bachelor and above).
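The construction of these indicators can be sketched as follows. This is only an illustration in Python, assuming DIP11 is stored as a two-character string; the helper name `diploma_dummies` is hypothetical:

```python
# DIP11 codes as listed above (strings, as in the survey file).
DIP11_CODES = ["71", "70", "60", "50", "41", "42", "33", "31", "30", "11", "10"]

def diploma_dummies(dip11):
    """Map one DIP11 code to a dict of eleven 0/1 indicators named Dip_<code>."""
    return {f"Dip_{code}": int(dip11 == code) for code in DIP11_CODES}

row = diploma_dummies("50")
print(row["Dip_50"], sum(row.values()))
```

Because every respondent with a known diploma code has exactly one indicator equal to 1, the eleven dummies sum to one, which is why they can all enter a regression only if the model has no intercept.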
The experience
According to the Human Capital Theory, experience is, like education, a strong determinant of an individual's wage. Workers acquire skills, competences, and productivity with experience, and these qualities allow them to increase their wage.
We computed work experience (exper) by subtracting the graduation year from the year of the survey. If the individual has no diploma, no graduation date is available, so we computed experience by subtracting 16 from the age:
\[ \text{exper} = \begin{cases} \text{age\_num} - 16 & \text{if } \text{dip11} = 71 \\ \text{annee\_num} - \text{datdip\_num} & \text{otherwise} \end{cases} \]
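The rule above can be written as a short function. This is a sketch with hypothetical sample values; the argument names follow the variables used in the text:

```python
def experience(annee_num, datdip_num, dip11, age_num):
    """Potential experience in years, following the rule stated above."""
    if dip11 == "71":
        # No diploma: no graduation year, so count years since age 16.
        return age_num - 16
    return annee_num - datdip_num

print(experience(2012, 2000, "50", 30))  # graduate: 2012 - 2000
print(experience(2012, None, "71", 40))  # no diploma: 40 - 16
```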
The immigration:
Immigration is the variable we are interested in: it is its impact on wages that we want to capture. As Borjas (2003) explains, immigration does not affect the whole labor force but smaller segments of it with a corresponding level of education. We also think that the sector and the occupation are important in defining these segments.
We first tried to express immigration as dummy variables defined by experience and level of education, but we could not get rid of multicollinearity problems. We therefore treat the level of education separately, as dummies, and measure immigration by sector of activity and profession.
To do so, we cross the number of immigrants by sector (variable NAFG17N¹) and profession (variable QPRC²), and create a new variable by taking the mean of the immigrant indicator within each cell of this crossing³, that is, the share of immigrants in each sector-profession cell.
¹ See the variables table in the appendix.
² See the variables table in the appendix.
³ Tables in the appendix.
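A minimal sketch of this crossing, assuming each observation carries the codes NAFG17N, QPRC and IMMI ("1" for immigrant). The function name `immigrant_share_by_cell` and the sample rows are hypothetical:

```python
from collections import defaultdict

def immigrant_share_by_cell(rows):
    """Return {(sector, profession): share of immigrants} from survey rows."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [immigrants, total]
    for row in rows:
        cell = (row["NAFG17N"], row["QPRC"])
        counts[cell][0] += 1 if row["IMMI"] == "1" else 0
        counts[cell][1] += 1
    return {cell: imm / total for cell, (imm, total) in counts.items()}

# Hypothetical observations (IMMI: "1" = immigrant, "2" = not immigrant).
rows = [
    {"NAFG17N": "FZ", "QPRC": "1", "IMMI": "1"},
    {"NAFG17N": "FZ", "QPRC": "1", "IMMI": "2"},
    {"NAFG17N": "FZ", "QPRC": "1", "IMMI": "2"},
    {"NAFG17N": "FZ", "QPRC": "1", "IMMI": "2"},
    {"NAFG17N": "KZ", "QPRC": "7", "IMMI": "2"},
]
shares = immigrant_share_by_cell(rows)
print(shares[("FZ", "1")], shares[("KZ", "7")])
```

Each worker is then assigned the share of the cell he or she belongs to, so the variable lies between 0 and 1.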
III) The model
To study the impact of immigration on the level of wages in France, we rely on the Human Capital Theory to define the wage level (with experience and education as the main variables). We add a variable accounting for the share of immigrants per sector and profession to observe the impact of immigration on the wage level.
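\[
\begin{aligned}
\log(HW) ={}& B_1\,\mathrm{Exper} + B_2\,\mathrm{Dip\_71} + B_3\,\mathrm{Dip\_70} + B_4\,\mathrm{Dip\_60} + B_5\,\mathrm{Dip\_50} + B_6\,\mathrm{Dip\_42} \\
&+ B_7\,\mathrm{Dip\_41} + B_8\,\mathrm{Dip\_33} + B_9\,\mathrm{Dip\_31} + B_{10}\,\mathrm{Dip\_30} + B_{11}\,\mathrm{Dip\_11} + B_{12}\,\mathrm{Dip\_10} \\
&+ B_{13}\,\mathrm{Part\_des\_Immi} + U
\end{aligned}
\]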
For this model we first make the basic assumptions under which OLS is the Best Linear Unbiased Estimator (BLUE):
- The model is correctly specified.
- E(U) = 0.
- V(U) = σ²I (homoskedastic, uncorrelated errors).
- Explanatory variables are not random.
- Rank of X (the matrix of explanatory variables) = k (and not k+1, because the model has no intercept).
After running all the tests (Hausman, Sargan, White), we end up with the following assumptions:
- The model is correctly specified.
- E(U) = 0.
- Cov(X, U) ≠ 0.
- V(U) = σ²Ω (heteroskedastic errors).
- Explanatory variables are not random.
- Rank of X (the matrix of explanatory variables) = k.
With this set of assumptions, the most suitable estimator is GMM, which is why we keep the coefficients estimated by this regression as our final result.
• First regression: OLS (Ordinary Least Squares)
Variable         Coefficient estimate   Standard error   P-value
Exper            0.009827               0.000167         <.0001
Dip_71           2.064727               0.00855          <.0001
Dip_70           1.975084               0.0133           <.0001
Dip_60           2.134281               0.00894          <.0001
Dip_50           2.16528                0.00674          <.0001
Dip_42           2.253148               0.00670          <.0001
Dip_41           2.289308               0.00840          <.0001
Dip_33           2.452845               0.0103           <.0001
Dip_31           2.414604               0.00661          <.0001
Dip_30           2.421689               0.0182           <.0001
Dip_11           2.790126               0.0116           <.0001
Dip_10           2.572066               0.00643          <.0001
part_des_immi    -0.99426               0.0375           <.0001
The results obtained seem logical.
The coefficients broadly increase with the level of education, as expected. The parameter related to experience is very low (about 1% per additional year of experience), which is at odds with the Human Capital Theory, which describes this variable as an important determinant of the wage level. For the immigration parameter, as expected, we have a negative coefficient.
Under the OLS assumptions stated earlier, all the estimators are unbiased with minimal variance. However, we introduce an instrument for each potentially endogenous variable to check the exogeneity/endogeneity of our model. Indeed, if the variables we use are endogenous, the coefficients obtained with OLS will not be consistent. That is why we ran a second regression with 2SLS.
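The inconsistency of OLS under endogeneity can be illustrated with a small simulation. This is a sketch on synthetic data, not on the survey: the data-generating process, the seed and all variable names are assumptions made for the illustration. With one endogenous regressor and one instrument, 2SLS reduces to the ratio of covariances used here.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(0)
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]   # instrument, independent of u
u = [random.gauss(0, 1) for _ in range(n)]   # error term
e = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ui + ei for zi, ui, ei in zip(z, u, e)]  # regressor correlated with u
y = [2.0 * xi + ui for xi, ui in zip(x, u)]        # true slope is 2

beta_ols = cov(x, y) / cov(x, x)  # inconsistent: converges to 2 + 1/3
beta_iv = cov(z, y) / cov(z, x)   # consistent: converges to 2
print(round(beta_ols, 2), round(beta_iv, 2))
```

In this setup the OLS slope stays near 2.33 however large the sample, while the instrumental-variable slope approaches the true value of 2.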
• Second regression: 2SLS (Two-Stage Least Squares)
Variable         Coefficient estimate   Standard error   P-value
Exper            0.006906               0.00526          0.1896
Dip_71           2.131593               0.1675           <.0001
Dip_70           2.888211               1.5523           0.0628
Dip_60           3.027611               0.6177           <.0001
Dip_50           2.064905               0.2512           <.0001
Dip_42           2.873439               0.5882           <.0001
Dip_41           1.65675                0.5630           0.0033
Dip_33           0.828759               1.1705           0.4789
Dip_31           2.184663               0.5411           <.0001
Dip_30           1.371147               1.9254           0.4764
Dip_11           3.0817                 0.9469           0.0011
Dip_10           3.414592               0.4217           <.0001
part_des_immi    -1.61895               0.5787           0.0052
To run a regression with 2SLS, we had to find instruments for every potentially endogenous variable.
We chose the square of experience (Exper2) as an instrumental variable for experience, and cspp and cspm as instruments for the level of education (cspp: “la catégorie socio-professionnelle du père”, the father's socio-professional category; cspm: “la catégorie socio-professionnelle de la mère”, the mother's socio-professional category). We created dummies for every category of workers (1 to 6) in CSPP and CSPM.
In this second regression with instrumental variables, we computed the Hausman statistic to test the exogeneity/endogeneity of our model. We obtained the following table in SAS:
Hausman's Specification Test Results
Efficient under H0   Consistent under H1   DF   Statistic   Pr > ChiSq
OLS                  2SLS                  13   29.25       0.0060
The statistic obtained is 29.25, which has to be compared with the chi-square critical value at 13 degrees of freedom (the number of variables in the model):
\[ Q = 29.25 > \chi^2_{0.95}(13) = 22.36 \]
Given this result, we reject H0 (exogeneity of the model) at the 5% level and conclude that our model is endogenous. With endogeneity, the OLS estimator is biased and inconsistent, while the 2SLS estimator is biased in finite samples but consistent.
To confirm the validity of this estimate, we need to check the homoskedasticity/heteroskedasticity of our model. To do so, we introduce the GMM estimation.
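The p-value attached to the Hausman statistic can be checked by hand. The sketch below computes the upper tail of the chi-square distribution from the series expansion of the regularized incomplete gamma function; the helper name `chi2_sf` is hypothetical, and the series is only meant for moderate values of the statistic such as this one:

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, h):
    # P(a, h) = h^a e^(-h) / Gamma(a + 1) * sum_n h^n / ((a + 1) ... (a + n)).
    term = 1.0
    total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= h / (a + n)
        total += term
    lower = math.exp(-h + a * math.log(h) - math.lgamma(a + 1.0)) * total
    return 1.0 - lower

p_value = chi2_sf(29.25, 13)  # Hausman statistic, 13 degrees of freedom
print(round(p_value, 4))
```

A p-value below 0.05 is equivalent to the statistic exceeding the 5% critical value, which is the comparison made in the text.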
• Third regression: GMM (Generalized Method of Moments)
Variable         Coefficient estimate   Standard error   P-value
Exper            0.007237               0.00496          0.1447
Dip_71           2.135492               0.1586           <.0001
Dip_70           2.664466               1.4542           0.0669
Dip_60           3.000466               0.5650           <.0001
Dip_50           2.089233               0.2318           <.0001
Dip_42           2.82725                0.5686           <.0001
Dip_41           1.683617               0.5657           0.0029
Dip_33           0.899214               1.2068           0.4562
Dip_31           2.176736               0.5394           <.0001
Dip_30           1.927076               1.8335           0.2933
Dip_11           3.266532               1.0005           0.0011
Dip_10           3.28864                0.4126           <.0001
part_des_immi    -1.62403               0.6138           0.0081
The results obtained with GMM are close to those obtained with 2SLS.
We use the same instruments for GMM as for 2SLS.
We conducted the White test (homoskedasticity/heteroskedasticity), with the following results table in SAS:
Heteroscedasticity Test
Equation   Test           Statistic   DF   Pr > ChiSq   Variables
logw       White's Test   18594       35   <.0001       Cross of all vars
The statistic obtained for the White test is 18594, which has to be compared with the chi-square critical value at 35 degrees of freedom:
\[ Q = 18594 > \chi^2_{0.95}(35) = 49.80 \]
Given this result, we reject homoskedasticity at the 5% level and conclude that the errors are heteroskedastic. With this conclusion, the better estimator for our model is GMM, so we keep the GMM estimates as our final results.
Finally, we conducted the Hansen/Sargan test to check the validity of our instruments. We obtain the following statistics in SAS:
GMM Test Statistics
Test                           DF   Statistic   Prob
Overidentifying Restrictions   4    3.17        0.5298
The statistic obtained is 3.17, which has to be compared with the chi-square critical value at 4 degrees of freedom (P − (k + 1), where P is the number of instrumental variables, 16 in our case, and k is the number of endogenous variables, 12 in our case; we drop the +1 because our model has no intercept):
\[ Q = 3.17 < \chi^2_{0.95}(4) = 9.49 \]
Given this result, we do not reject the overidentifying restrictions at the 5% level, and we consider our instruments valid.
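The degrees of freedom and the reported probability of the overidentification test can be reproduced in a few lines. This sketch uses the closed form of the chi-square upper tail that holds when the number of degrees of freedom is even; the function name `chi2_sf_even` is hypothetical:

```python
import math

def chi2_sf_even(x, df):
    """P(X > x) for a chi-square variable when df is an even positive integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires an even, positive df")
    half = x / 2.0
    # exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

n_instruments = 16   # P in the text
n_endogenous = 12    # k in the text
df = n_instruments - n_endogenous
p_value = chi2_sf_even(3.17, df)
print(df, round(p_value, 4))
```

The result matches the probability printed by SAS (0.5298): the overidentifying restrictions are far from being rejected.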
IV) Results
First of all, we lose precision on most of our coefficients in the 2SLS and GMM regressions compared to OLS.
The coefficients on Dip_70, Dip_33 and Dip_30 are no longer significant. We suppose that this comes from the fact that each of these categories represents less than 5% of the total observations⁴.
The coefficient on experience is also insignificant. We suppose that this is because we did not create categories for the different levels of experience. In the experience table⁵, the standard deviation is 12.37, which means that the variance is very large (≈153). We suppose that this is why the estimate of the experience coefficient is not significant.
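The significance statements can be checked directly from the reported estimates and standard errors. The sketch below uses the normal approximation to the t distribution, which is an assumption that is reasonable only because the sample is large; the function name `two_sided_p` is hypothetical:

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, using the normal approximation."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2.0))

# GMM estimates and standard errors reported in the third regression.
p_exper = two_sided_p(0.007237, 0.00496)
p_immi = two_sided_p(-1.62403, 0.6138)
print(round(p_exper, 3), round(p_immi, 3))
```

The experience coefficient is less than 1.5 standard errors away from zero, so it is not significant at the 5% level, whereas the immigration coefficient is more than 2.6 standard errors away from zero.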
Regarding the interpretation of our coefficients, we cannot state the actual impact on wages. With a semi-log model such as ours, we are supposed to be able to say that a one-unit increase in a variable increases or decreases the wage by approximately 100·B%. However, except for the coefficient on experience, which is insignificant, all our significant coefficients are greater than one in absolute value. For the diploma dummies this follows from the absence of an intercept: each of these coefficients is the baseline log hourly wage of its category rather than a percentage effect.
Still, the relation between the diploma level and the wage level is positive: the education level seems to affect the wage level positively. The largest coefficients are those of the highest diploma levels, Dip_10 and Dip_11 (bachelor and above).
We can at least conclude that we found a negative relation between natives' wage level and the share of immigrants in their sector of activity and profession. Unfortunately, we cannot quantify this effect with certainty.
⁴ See the table relative to dip11.
⁵ See the table relative to experience.
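For illustration only, the usual semi-log reading of the GMM estimates can be sketched as follows. It assumes that part_des_immi is a share between 0 and 1 (so a change of 0.01 is one percentage point) and that, without an intercept, a diploma coefficient is the baseline log hourly wage of its category. The function name `pct_change` is hypothetical, and the numbers remain subject to the caveats stated above:

```python
import math

def pct_change(coef, delta):
    """Percentage change in the wage implied by a semi-log coefficient
    when the regressor changes by delta: 100 * (exp(coef * delta) - 1)."""
    return 100.0 * (math.exp(coef * delta) - 1.0)

# GMM estimates from the third regression.
one_year_exper = pct_change(0.007237, 1.0)    # one more year of experience
one_point_immi = pct_change(-1.62403, 0.01)   # immigrant share up by 0.01
baseline_dip71 = math.exp(2.135492)           # wage level implied by the Dip_71 coefficient
print(round(one_year_exper, 2), round(one_point_immi, 2), round(baseline_dip71, 2))
```

Read this way, a one-point rise in the immigrant share of a sector-profession cell goes with a wage roughly 1.6% lower, while the Dip_71 coefficient corresponds to a baseline of about 8.5 euros per hour.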
V) Conclusion
The model we constructed is probably too simple to give significant and interpretable results on the impact of immigration.
Nevertheless, we can outline the direction of the impact of the variables in our model on the wage level.
- The level of education has a positive effect on the wage level. The highest levels of education have the strongest impact.
- The share of immigrants in a native worker's sector of activity and profession lowers his or her wage level, since we found a negative relation between part_des_immi and logw.
The Human Capital Theory and the law of supply and demand imply results of this kind.
References
Borjas, George J. (2003). “The Labor Demand Curve Is Downward Sloping: Reexamining the Impact of Immigration on the Labor Market.” Quarterly Journal of Economics 118(4): 1335-1374.
Carrasco, Raquel, Juan F. Jimeno, and Carolina A. Ortega (2008). “The Effect of Immigration on the Labor Market Performance of Native-Born Workers: Some Evidence for Spain.” JEL classification J21, J11.
Devlin, Ciaran, Olivia Bolt, Dhiren Patel, David Harding, and Ishtiaq Hussain (2014). “Impacts of Migration on UK Native Employment: An Analytical Review of the Evidence.” Home Office and Department for Business, Innovation and Skills, 6 March 2014.
Ottaviano, Gianmarco I. P., and Giovanni Peri (2012). “Rethinking the Effect of Immigration on Wages.” Journal of the European Economic Association 10(1): 152-197.
Appendix 1: Description of the variables (variables table)

N°   Variable   Type   Size   Label
3    AGE        Char   3      Detailed age on the last day of the reference week
5    ANCENTR    Num    8      Seniority in the firm or in the civil service (in months)
6    ANNEE      Char   4      Year of the survey
7    CSPM       Char   2      Socio-professional category of the mother
8    CSPP       Char   2      Socio-professional category of the father
9    DATDIP     Char   4      Year in which the highest diploma was obtained
10   DIP11      Char   2      Highest diploma obtained (11 categories)
11   EXTRI13    Num    8      Individual weighting coefficient, calibrated on the 2013 demographic estimates
12   IDENT      Char   8      Anonymized dwelling identifier
13   IMMI       Char   1      Being an immigrant
14   NAFG17N    Char   2      Activity of the current establishment (NAF rev. 2, 17 categories)
15   NBHEUR     Num    8      Number of hours corresponding to the declared wage
16   NOI        Char   2      Individual identification number
17   QPRC       Char   1      Classification of the main salaried job
18   RGI        Char   1      Interview rank of the individual
19   SALMEE     Num    8      Declared monthly wage of the main job (including bonuses and monthly supplements)
20   SALRED     Num    8      Net monthly wage adjusted for non-response (including monthly-equivalent bonuses, adjusted for non-response)
21   SALSEE     Num    8      Income from secondary activities (gross or net)
22   SEXE       Char   1      Sex
23   STAT2      Char   2      Status (employee, non-employee), made consistent with the profession
24   TOTNBH     Char   5      Total number of hours worked during the reference week across all jobs
25   TRIM       Char   1      Quarter of the survey

Values by variable:

AGE
  14 to 98 - Detailed age
  99 - 99 years and over

ANCENTR
  (blank) - Not applicable (ACF='2', '3') or not provided
  0 to 60 - Number of months
  More than 60 - Number of months between the year of entry and the year of collection

ANNEE
  2012 - Year of the survey

CSPM and CSPP (same list of codes for both variables)
  00 - Unknown
  10 - Farmers
  21 - Craftsmen
  22 - Shopkeepers and similar
  23 - Heads of businesses with 10 or more employees
  31 - Liberal professions
  33 - Civil service executives
  34 - Professors, scientific professions
  35 - Information, arts and entertainment professions
  37 - Administrative and commercial company executives
  38 - Engineers and technical company executives
  42 - School teachers, primary teachers and similar professions
  43 - Intermediate professions in health and social work
  44 - Clergy, religious
  45 - Intermediate administrative professions in the civil service
  46 - Intermediate administrative and commercial professions in companies
  47 - Technicians
  48 - Foremen, supervisors
  52 - Civil employees and service agents in the civil service
  53 - Police and military
  54 - Company administrative employees
  55 - Retail employees
  56 - Personal service workers
  62 - Skilled industrial workers
  63 - Skilled craft workers
  64 - Drivers
  65 - Skilled handling, warehousing and transport workers
  67 - Unskilled industrial workers
  68 - Unskilled craft workers
  69 - Agricultural workers and similar
  71 - Retired farmers
  72 - Retired craftsmen, shopkeepers, heads of businesses
  74 - Retired executives
  75 - Retired intermediate professions
  77 - Retired employees
  78 - Retired manual workers
  81 - Unemployed who have never worked
  82 - Other inactive persons (other than retirees)

DATDIP
  1900 to 2012 - Year

DIP11
  (blank) - Not provided
  10 - Licence (L3), Maîtrise (M1), Master (research or professional), DEA, DESS, Doctorate
  11 - Schools at bachelor level and above
  30 - DEUG
  31 - BTS, DUT or equivalent
  33 - Paramedical and social (Bac+2 level)
  41 - General Baccalaureate
  42 - Technological Baccalaureate, vocational Baccalaureate or equivalent
  50 - CAP, BEP or equivalent
  60 - Brevet des collèges
  70 - Certificat d'études primaires
  71 - Without diploma

EXTRI13
  Numeric coefficient on 8 characters

IDENT
  Identifier on 7 positions

IMMI
  1 - Yes
  2 - No

NAFG17N
  (blank) - Not applicable
  AZ - Agriculture, forestry and fishing
  C1 - Manufacture of food products, beverages and tobacco products
  C2 - Coking and refining
  C3 - Manufacture of electrical, electronic and computer equipment; manufacture of machinery
  C4 - Manufacture of transport equipment
  C5 - Manufacture of other industrial products
  DE - Mining and quarrying, energy, water, waste management and remediation
  FZ - Construction
  GZ - Trade; repair of motor vehicles and motorcycles
  HZ - Transportation and storage
  IZ - Accommodation and food services
  JZ - Information and communication
  KZ - Financial and insurance activities
  LZ - Real estate activities
  MN - Scientific and technical activities; administrative and support services
  OQ - Public administration, education, human health and social work
  RU - Other service activities
  00 - Not provided

NBHEUR
  (blank) - Not applicable (intermediate interview) or not declared
  1 to 250 - Number of hours corresponding to the declared wage

NOI
  01 to 15 - Order number of the person in the dwelling

QPRC
  (blank) - Not applicable (ACTOP='2') or not provided
  1 - Labourer or semi-skilled worker
  2 - Skilled or highly skilled worker
  3 - Technician
  4 - Office, retail or service employee; category C or D staff
  5 - Supervisor, administrative or commercial supervisory staff, sales representative (VRP, non-executive); category B staff
  7 - Engineer, executive (except chief executives and their direct deputies); category A staff
  8 - Chief executive, direct deputy
  9 - Other

RGI
  1 - 1st interview of the individual
  2 - 2nd interview of the individual
  3 - 3rd interview of the individual
  4 - 4th interview of the individual
  5 - 5th interview of the individual
  6 - 6th interview of the individual

SALMEE
  (blank) - Not applicable (intermediate interview) or refusal
  0 to 999999 - Amount in euros

SALRED
  (blank) - Not applicable (intermediate interview)
  0 to 999999 - Amount in euros

SALSEE
  (blank) - Refusal, does not know, or not applicable (intermediate interview)
  0 to 999999 - Amount in euros

SEXE
  1 - Male
  2 - Female

STAT2
  (blank) - Not applicable (ACT='2', '3')
  1 - Non-employee
  2 - Employee

TOTNBH
  (blank) - Not applicable
  0.00 to 99.59 - Number of hours

TRIM
  1 to 4 - 1st to 4th quarter of the year
Appendix 2: SAS code
libname sortie 'C:\Users\Lenovo\Desktop\AMSE\ECONOMETRICS project\SAS results'; run;
%macro import_dbf_file(file_to_import, sas_data_file);
proc import datafile=&file_to_import
out= sortie.&sas_data_file
dbms=dbf replace;
run;
%mend import_dbf_file;
%import_dbf_file("C:\Users\Lenovo\Desktop\AMSE\ECONOMETRICS project\project_data\indiv122.dbf",ee122);
%import_dbf_file("C:\Users\Lenovo\Desktop\AMSE\ECONOMETRICS project\project_data\indiv121.dbf",ee121);
%import_dbf_file("C:\Users\Lenovo\Desktop\AMSE\ECONOMETRICS project\project_data\indiv123.dbf",ee123);
*Select useful variables;
data ee122_clean; set sortie.ee121(keep=AGE ANCENTR ANNEE CSPM CSPP DATDIP DIP11 EXTRI13 IDENT NOI TRIM); run;
data ee121_clean; set sortie.ee122(keep=EXTRI13 IDENT IMMI NAFG17N NBHEUR NOI TRIM); run;
data ee123_clean; set sortie.ee123(keep=EXTRI13 IDENT QPRC NOI RGI SALMEE SEXE STAT2 TOTNBH TRIM); run;
proc contents data=ee121_clean; run;
proc contents data=ee122_clean; run;
proc contents data=ee123_clean; run;
*We merge the datasets into a unique file;
proc sort data=ee121_clean; by ident noi trim; run;
proc sort data=ee122_clean; by ident noi trim; run;
proc sort data=ee123_clean; by ident noi trim; run;
data ee_all_clean; merge ee121_clean ee122_clean ee123_clean; by ident noi trim; run;
proc contents data=ee_all_clean; run;
data ee_all_clean; set ee_all_clean;
*We convert some variables from character to numeric;
annee_num = input(annee,4.);
age_num = input(age,2.);
datdip_num = input(datdip,4.);
totnbh_num = input(totnbh,2.);
trim_num = input(trim,1.);
*Create variable for native wage (immigrants' wages are set to missing);
if immi='1' then salmee=.;
hourly_wage=salmee/nbheur;
logw=log(hourly_wage);
*Definition of experience since school leaving and seniority + instrumental variable exper**2;
exper=annee_num-datdip_num; if dip11='71' then exper=age_num-16;
exper2 = exper**2;
seniority=int(ancentr/12);
*The difference between experience and seniority should be positive;
dif_exp_sen=exper-seniority;
*Creation of dummy for observation relative to an individual;
*Note: naia is not among the variables kept in the keep= lists above;
lag1_ok=(ident=lag(ident) & naia=lag(naia) & sexe=lag(sexe) & trim=lag(trim)+1);
*Creation of dummies corresponding to the level of education;
Dip_70=(dip11='70');
Dip_71=(dip11='71');
Dip_50=(dip11='50');
Dip_60=(dip11='60');
Dip_41=(dip11='41');
Dip_42=(dip11='42');
Dip_30=(dip11='30');
Dip_31=(dip11='31');
Dip_33=(dip11='33');
Dip_10=(dip11='10');
Dip_11=(dip11='11');
*Creation of instruments: grouped socio-professional categories of the mother and of the father;
*The IN operator is used because (cspm=21 & 22 & 23) is true only when cspm=21;
cspm_00=(cspm='00');
cspm_01=(cspm='10');
cspm_02=(cspm in ('21','22','23'));
cspm_03=(cspm in ('31','33','34','35','37','38'));
cspm_04=(cspm in ('42','43','44','45','46','47','48'));
cspm_05=(cspm in ('52','53','54','55','56'));
cspm_06=(cspm in ('62','63','64','65','67','68','69'));
cspp_00=(cspp='00');
cspp_01=(cspp='10');
cspp_02=(cspp in ('21','22','23'));
cspp_03=(cspp in ('31','33','34','35','37','38'));
cspp_04=(cspp in ('42','43','44','45','46','47','48'));
cspp_05=(cspp in ('52','53','54','55','56'));
cspp_06=(cspp in ('62','63','64','65','67','68','69'));
run;
proc means data=ee_all_clean(where=(dip11='71'));
var exper datdip_num;
run;
data ee_all_clean;set ee_all_clean;
imm=(immi=1);
run;
*Create the Immigration per Sector, and profession variable;
PROC SORT DATA = ee_all_clean; by NAFG17N QPRC;
RUN;
PROC MEANS NOPRINT DATA = ee_all_clean; by NAFG17N QPRC;
VAR IMM;
OUTPUT OUT = PART_IMMI MEAN=part_des_immi;
RUN;
proc print data=part_immi;
run;
*Insert the new table containing the immigration per sector & per profession in the main table;
DATA ee_all_clean; merge ee_all_clean PART_IMMI; by NAFG17N QPRC;
RUN;
*We keep only observations relative to employees;
data ee_all_clean; set ee_all_clean(where=(stat2='2')); run;
*Inspection of qualitative variables;
proc freq data = ee_all_clean;
tables cspp cspm dip11 immi nafg17n qprc sexe salmet stat2;
run;
*Inspection of quantitative variables;
proc means data= ee_all_clean;
var exper logw hourly_wage nbheur part_des_immi salmee salred salsee seniority;
run;
proc means data=ee_all_clean;
class rgi;
var salmee;
run;
proc means data=ee_all_clean(where=(salmee ne .));
var salmee salred salsee seniority datdip_num nbheur totnbh_num cspm cspp;
run;
*Identification of inconsistent observations (experience and seniority);
data ee_all_clean; set ee_all_clean;
outlier1_exp_age=(exper-(age_num-12)>0);
outlier1_exp_sen=(exper-seniority<0);
run;
proc contents data=ee_all_clean;
run;
*Clear the table from the missing values;
data ee_all_clean_nomissing; set ee_all_clean;
if hourly_wage ne . & seniority ne . & exper ne .;
run;
*Check;
proc univariate data=ee_all_clean_nomissing;
var dif_exp_sen seniority exper;
run;
*Check of main variables;
proc univariate data=ee_all_clean_nomissing;
var logw hourly_wage exper seniority nbheur part_des_immi salmee dip11;
run;
*Check if there is inconsistent value relative to wage;
proc means mean min max data= ee_all_clean_nomissing;
class dip11;
var salmee hourly_wage;
run;
*Check the importance of outliers relative to experience;
proc freq data=ee_all_clean_nomissing;
tables outlier1_exp_age outlier1_exp_sen;
run;
*Defining outliers;
data ee_all_clean_nomissing; set ee_all_clean_nomissing;
outlier2_exp_sen=(exper-seniority<-1);
outlier_salmee=(salmee<=0);
outlier_nbheur=(nbheur<20 or nbheur>200);
outlier_hourly_wage= (hourly_wage < 5.0 or hourly_wage >100.0);
run;
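In SAS, a missing numeric value compares as smaller than any number, so conditions such as salmee<=0 or nbheur<20 are also true when the variable is missing. The sketch below shows one way to separate genuine outliers from missing values. It is an illustration under assumptions, not the authors' method: the table name outlier_check and the _nm suffix are hypothetical.

```sas
* Sketch only: outlier_check and the _nm variables are hypothetical names;
data outlier_check; set ee_all_clean_nomissing;
* Flag a value only when it is present and out of range;
outlier_salmee_nm = (not missing(salmee) and salmee <= 0);
outlier_nbheur_nm = (not missing(nbheur) and (nbheur < 20 or nbheur > 200));
run;
* Compare the original flags with the guarded ones;
proc freq data=outlier_check;
tables outlier_nbheur*outlier_nbheur_nm outlier_salmee*outlier_salmee_nm / missing;
run;
```

Rows where the original flag is 1 and the guarded flag is 0 are observations flagged only because the value is missing.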
*Check the importance of the outliers that we just created;
proc means data=ee_all_clean_nomissing;
var outlier_nbheur outlier_hourly_wage;
run;
*Remove the observations flagged as outliers (hourly wage, experience vs. age, experience vs. seniority, hours worked);
data ee_all_clean2; set ee_all_clean_nomissing;
if outlier_hourly_wage = 0 and outlier1_exp_age = 0
and outlier2_exp_sen = 0 and outlier_nbheur = 0;
run;
*Check of main variables;
proc univariate data= ee_all_clean2(where=(hourly_wage ne . & seniority ne . & exper ne . ));
var salmee hourly_wage exper dip11 part_des_immi;
run;
*we keep the first observation for each person;
data ee_all_clean_cross_section; set ee_all_clean2; by NAFG17N QPRC;
run;
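In a DATA step, a BY statement on its own does not drop any rows. Selecting one row per group is usually done with the FIRST. automatic variable. The sketch below shows that pattern under a loud assumption: person_id is a hypothetical person identifier, since the listing does not show which variable identifies an individual. The table names ee_sorted and ee_first_obs are also hypothetical.

```sas
* Sketch only: person_id, ee_sorted and ee_first_obs are hypothetical names;
proc sort data=ee_all_clean2 out=ee_sorted; by person_id;
run;
data ee_first_obs; set ee_sorted; by person_id;
* Keep only the first row of each person;
if first.person_id;
run;
```

PROC SORT keeps the original relative order of rows within a BY group by default, so the retained row is the first one that appeared for each person in ee_all_clean2.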
*Check for instruments;
proc freq data= ee_all_clean_cross_section;
tables cspp_00 cspp_01 cspp_02 cspp_03 cspp_04 cspp_05 cspp_06
cspm_00 cspm_01 cspm_02 cspm_03 cspm_04 cspm_05 cspm_06 exper2;
run;
*OLS estimation;
PROC REG data=ee_all_clean_cross_section
plots(maxpoints=30000) = residualhistogram;
MODEL logw = exper dip_71 dip_70 dip_60 dip_50 dip_42 dip_41 dip_33 dip_31
dip_30 dip_11 dip_10 part_des_immi /noint;
RUN;
*Instrumental variables estimation + Hausman and White tests (heteroskedasticity);
proc model data=ee_all_clean_cross_section;
logw= b1*exper + b2*dip_71 + b3*dip_70 + b4*dip_60 + b5*dip_50 + b6*dip_42
+ b7*dip_41 + b8*dip_33 + b9*dip_31 + b10*dip_30 + b11*dip_11 + b12*dip_10
+ b13*part_des_immi;
endogenous logw exper dip_71 dip_70 dip_60 dip_50 dip_42 dip_41 dip_33
dip_31 dip_30 dip_11 dip_10;
exogenous part_des_immi;
instruments _exog_ exper2 cspp_00 cspp_01 cspp_02 cspp_03 cspp_04 cspp_05
cspp_06 cspm_00 cspm_01 cspm_02 cspm_03 cspm_04 cspm_05 cspm_06
part_des_immi;
fit logw/ OLS 2SLS white KERNEL=(parzen,0,) HAUSMAN;
run;
*GMM estimation + White Test (heteroskedasticity);
proc model data=ee_all_clean_cross_section;
logw= b1*exper + b2*dip_71 + b3*dip_70 + b4*dip_60 + b5*dip_50 + b6*dip_42
+ b7*dip_41 + b8*dip_33 + b9*dip_31 + b10*dip_30 + b11*dip_11 + b12*dip_10
+ b13*part_des_immi;
endogenous logw exper dip_71 dip_70 dip_60 dip_50 dip_42 dip_41 dip_33
dip_31 dip_30 dip_11 dip_10;
exogenous part_des_immi;
instruments _exog_ exper2 cspp_00 cspp_01 cspp_02 cspp_03 cspp_04 cspp_05
cspp_06 cspm_00 cspm_01 cspm_02 cspm_03 cspm_04 cspm_05 cspm_06
part_des_immi;
fit logw/ GMM white KERNEL=(parzen,0,);
run;
Appendices 3 Tables:
Table Dip11:
DIP11
DIP11  Frequency  Percent  Cumulative frequency  Cumulative percent
10     3394       10.76    3394                  10.76
11     807        2.56     4201                  13.32
30     312        0.99     4513                  14.31
31     3685       11.68    8198                  25.99
33     1020       3.23     9218                  29.23
41     2153       6.83     11371                 36.05
42     4115       13.05    15486                 49.10
50     9211       29.20    24697                 78.30
60     2399       7.61     27096                 85.91
70     876        2.78     27972                 88.69
71     3568       11.31    31540                 100.00
Table Experience:
N      Mean        Std. dev.   Minimum  Maximum
31540  22.0846861  12.3736197  0        67.0000000
Table Part_des_immi (a dot marks a missing NAFG17N or QPRC code):
Obs.  NAFG17N  QPRC  _TYPE_  _FREQ_  part_des_immi
1     .        .     0       217047  0.10030
2     .        1     0       2       0.50000
3     .        7     0       2       0.00000
4     .        9     0       4       0.00000
5     00       .     0       75      0.17333
6     00       1     0       59      0.25424
7     00       2     0       80      0.32500
8     00       3     0       67      0.07463
9     00       4     0       274     0.18248
10    00       5     0       56      0.00000
11    00       7     0       102     0.06863
12    00       8     0       16      0.00000
13    00       9     0       42      0.04762
14    AZ       .     0       4439    0.02005
15    AZ       1     0       749     0.13218
16    AZ       2     0       739     0.07578
17    AZ       3     0       112     0.00000
18    AZ       4     0       242     0.02066
19    AZ       5     0       54      0.00000
20    AZ       7     0       114     0.07018
21    AZ       8     0       16      0.00000
22    AZ       9     0       39      0.00000
23    C1       .     0       491     0.02444
24    C1       1     0       1009    0.07136
25    C1       2     0       1428    0.05392
26    C1       3     0       247     0.05668
27    C1       4     0       853     0.08441
28    C1       5     0       437     0.01144
29    C1       7     0       399     0.04010
30    C1       8     0       40      0.10000
31    C1       9     0       69      0.00000
32    C2       2     0       9       0.00000
33    C2       3     0       8       0.00000
34    C2       4     0       14      0.14286
35    C2       5     0       21      0.00000
36    C2       7     0       32      0.03125
37    C2       8     0       8       0.25000
38    C3       .     0       56      0.12500
39    C3       1     0       340     0.08529
40    C3       2     0       998     0.06513
41    C3       3     0       634     0.06309
42    C3       4     0       343     0.05539
43    C3       5     0       305     0.03279
44    C3       7     0       1035    0.08599
45    C3       8     0       46      0.04348
46    C3       9     0       28      0.03571
47    C4       .     0       24      0.00000
48    C4       1     0       312     0.13141
49    C4       2     0       1025    0.07805
50    C4       3     0       564     0.05496
51    C4       4     0       174     0.06897
52    C4       5     0       212     0.04245
53    C4       7     0       918     0.06427
54    C4       8     0       3       0.00000
55    C4       9     0       24      0.04167
56    C5       .     0       881     0.06697
57    C5       1     0       1918    0.10584
58    C5       2     0       4430    0.06591
59    C5       3     0       1750    0.06171
60    C5       4     0       1427    0.07708
61    C5       5     0       1318    0.04932
62    C5       7     0       1912    0.04498
63    C5       8     0       80      0.10000
64    C5       9     0       76      0.10526
65    DE       .     0       53      0.09434
66    DE       1     0       262     0.16031
67    DE       2     0       525     0.08762
68    DE       3     0       558     0.04659
69    DE       4     0       455     0.06154
70    DE       5     0       602     0.03987
71    DE       7     0       656     0.03963
72    DE       8     0       29      0.13793
73    DE       9     0       61      0.03279
74    FZ       .     0       3232    0.15347
75    FZ       1     0       1704    0.23826
76    FZ       2     0       5190    0.19904
77    FZ       3     0       819     0.09524
78    FZ       4     0       1061    0.09614
79    FZ       5     0       750     0.08667
80    FZ       7     0       1022    0.05577
81    FZ       8     0       56      0.01786
82    FZ       9     0       150     0.05333
83    GZ       .     0       3690    0.10867
84    GZ       1     0       1352    0.08580
85    GZ       2     0       2904    0.08919
86    GZ       3     0       1160    0.06207
87    GZ       4     0       10578   0.07818
88    GZ       5     0       2050    0.07561
89    GZ       7     0       2986    0.05794
90    GZ       8     0       239     0.05021
91    GZ       9     0       214     0.09813
92    HZ       .     0       584     0.23973
93    HZ       1     0       703     0.13087
94    HZ       2     0       2450    0.06776
95    HZ       3     0       557     0.04129
96    HZ       4     0       3266    0.07379
97    HZ       5     0       1227    0.04238
98    HZ       7     0       1236    0.05016
99    HZ       8     0       28      0.00000
100   HZ       9     0       348     0.02586
101   IZ       .     0       1522    0.15900
102   IZ       1     0       326     0.21472
103   IZ       2     0       753     0.17928
104   IZ       3     0       90      0.18889
105   IZ       4     0       3819    0.18591
106   IZ       5     0       387     0.08269
107   IZ       7     0       289     0.16263
108   IZ       8     0       55      0.12727
109   IZ       9     0       131     0.20611
110   JZ       .     0       576     0.12847
111   JZ       1     0       33      0.09091
112   JZ       2     0       71      0.00000
113   JZ       3     0       695     0.05899
114   JZ       4     0       864     0.06597
115   JZ       5     0       387     0.03618
116   JZ       7     0       2781    0.09924
117   JZ       8     0       49      0.14286
118   JZ       9     0       103     0.05825
119   KZ       .     0       295     0.05424
120   KZ       1     0       32      0.00000
121   KZ       2     0       53      0.09434
122   KZ       3     0       434     0.00000
123   KZ       4     0       2361    0.03558
124   KZ       5     0       665     0.03308
125   KZ       7     0       2489    0.07151
126   KZ       8     0       125     0.09600
127   KZ       9     0       45      0.04444
128   LZ       .     0       429     0.08392
129   LZ       1     0       86      0.30233
130   LZ       2     0       98      0.33673
131   LZ       3     0       50      0.16000
132   LZ       4     0       1049    0.14871
133   LZ       5     0       211     0.04265
134   LZ       7     0       343     0.04956
135   LZ       8     0       28      0.00000
136   LZ       9     0       33      0.09091
137   MN       .     0       2965    0.07825
138   MN       1     0       2636    0.19613
139   MN       2     0       2391    0.19866
140   MN       3     0       1445    0.07336
141   MN       4     0       5992    0.14786
142   MN       5     0       1258    0.09618
143   MN       7     0       4217    0.08442
144   MN       8     0       193     0.06218
145   MN       9     0       392     0.11735
146   OQ       .     0       3253    0.06025
147   OQ       1     0       1836    0.10349
148   OQ       2     0       1829    0.09021
149   OQ       3     0       2143    0.05320
150   OQ       4     0       25811   0.06586
151   OQ       5     0       8871    0.02390
152   OQ       7     0       13875   0.04029
153   OQ       8     0       152     0.01974
154   OQ       9     0       4857    0.07227
155   RU       .     0       2018    0.09861
156   RU       1     0       774     0.19638
157   RU       2     0       824     0.12985
158   RU       3     0       484     0.07645
159   RU       4     0       6746    0.18040
160   RU       5     0       497     0.08249
161   RU       7     0       1026    0.11988
162   RU       8     0       88      0.10227
163   RU       9     0       818     0.16137