The Indonesian Family Life Survey (IFLS) is an ongoing longitudinal survey in Indonesia. The sample is representative of about 83% of the Indonesian population and contains over 30,000 individuals living in 13 of the 27 provinces in the country. A map identifying the 13 IFLS provinces is available on the RAND Family Life Surveys web site. The first wave of the IFLS (IFLS1) was conducted in 1993/94 by RAND in collaboration with Lembaga Demografi, University of Indonesia. IFLS2 and IFLS2+ were conducted in 1997 and 1998, respectively, by RAND in collaboration with UCLA and Lembaga Demografi, University of Indonesia. IFLS2+ covered a 25% sub-sample of the IFLS households. IFLS3, which was fielded in 2000 and covered the full sample, was conducted by RAND in collaboration with the Population Research Center, University of Gadjah Mada. IFLS4, fielded in 2007/2008 on the same 1993 households and split-offs, was conducted by RAND in collaboration with the Center for Population and Policy Studies (CPPS) of the University of Gadjah Mada and SurveyMETER.
The 1993 Indonesia Family Life Survey (IFLS) provides data at the individual and family level on fertility, health, education, migration, and employment. Extensive community and facility data accompany the household data. The survey was a collaborative effort of Lembaga Demografi of the University of Indonesia and RAND, with support from the National Institute of Child Health and Human Development, USAID, Ford Foundation, and the World Health Organization. In Indonesia, the 1993 IFLS is also referred to as SAKERTI 93 (Survai Aspek Kehidupan Rumah Tangga Indonesia). The IFLS covers a sample of 7,224 households spread across 13 provinces on the islands of Java, Sumatra, Bali, West Nusa Tenggara, Kalimantan, and Sulawesi. Together these provinces encompass approximately 83 percent of the Indonesian population and much of its heterogeneity. The survey brings an interdisciplinary perspective to four broad topic areas:
• Fertility, family planning, and contraception
• Infant and child health and survival
• Education, migration and employment
• The social, economic, and health status of adults, young and old
Additionally, extensive community and facility data accompany the household data. Village leaders and heads of the village women's group provided information in each of the 321 enumeration areas from which households were drawn, and data were collected from 6,385 schools and health facilities serving community residents.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
The household survey collected detailed data on the following:
(a) HOUSEHOLD QUESTIONNAIRE
1. Household Characteristics
- Household composition, consumption
- Health provider knowledge
- Household Economy
- Farm and non-farm business
- Labor and non-labor income
- Household assets
- Economic shocks
- Health insurance
2. Adult Information (ages 15+):
- Marriage histories
- Health status
- Acute morbidity
- Health care utilization
- Non-coresident parents
- Siblings and children
- Interhousehold transfers
- Individual assets
3. Ever-Married Female Information (ages 15-49)
- Pregnancy history
- Contraceptive knowledge and use
- Contraceptive calendar
- Infant feeding practices
4. Child Information (ages 0-14)
- Education history
- Acute morbidity
- Health care utilization
1. Community Characteristics:
- Transportation, electricity
- Water and sanitation
- Agriculture and industry
- History and climate
- History of schools and health facility
- Village statistics
- Prices of foodstuffs, history of school and health facility availability
2. Government Health Center/ Sub-Health Center, Private Physicians and Clinics, Nurses, Midwives and Paramedics:
- Facility management and history
- Service availability
- Staff and equipment
- Family planning
- Vignettes on types of care
3. Community Health Posts and Family Planning Posts, Traditional Health Practitioners:
- Facility management and history
- Service availability
- Staff and equipment
- Family planning
4. Primary School, Junior Secondary School, Senior Secondary School:
- Staff and school characteristics
- Classroom characteristics
- Student test scores
Agriculture & Rural Development
Food (production, crisis)
Land (policy, resource management)
Access to Finance
Migration & Remittances
Population & Reproductive Health
Household Survey data were collected for household members through direct interviews (for adults) and proxy interviews (for children, infants and temporarily absent household members). The IFLS-1 conducted detailed interviews with the following household members:
- The household head and their spouse
- Two randomly selected children of the head and spouse aged 0 to 14 (interviewed by proxy)
- An individual age 50 and above and their spouse, randomly selected from remaining members
- For a randomly selected 25 percent of the households, an individual age 15 to 49 and their spouse, randomly selected from remaining members.
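The selection rules listed above can be sketched in code. This is only an illustration under stated assumptions: the member record layout (`id`, `age`, `relation`, `spouse_id`), the relation labels, and the function name `select_respondents` are hypothetical and do not come from the IFLS files, and the 25 percent household draw is represented by a simple flag.

```python
import random

def select_respondents(members, rng, extra_adult_household=False):
    """Apply the IFLS-1 selection rules to one household (hypothetical schema)."""
    by_id = {m["id"]: m for m in members}
    selected = []  # member ids, in order of selection

    def add(member):
        if member is not None and member["id"] not in selected:
            selected.append(member["id"])

    def add_with_spouse(member):
        add(member)
        add(by_id.get(member.get("spouse_id")))

    def pick_from_remaining(min_age, max_age):
        # Randomly choose one not-yet-selected member in the age range, plus spouse.
        pool = [m for m in members
                if m["id"] not in selected and min_age <= m["age"] <= max_age]
        if pool:
            add_with_spouse(rng.choice(pool))

    # Rule 1: the household head and their spouse.
    for m in members:
        if m["relation"] == "head":
            add_with_spouse(m)

    # Rule 2: up to two randomly selected children of the head aged 0 to 14.
    children = [m for m in members
                if m["relation"] == "child_of_head" and m["age"] <= 14]
    for m in rng.sample(children, min(2, len(children))):
        add(m)

    # Rule 3: one remaining member aged 50 and above, plus spouse.
    pick_from_remaining(50, 200)

    # Rule 4: only in the 25 percent subsample, one remaining member aged 15 to 49.
    if extra_adult_household:
        pick_from_remaining(15, 49)
    return selected

# Hypothetical household used only for illustration.
household = [
    {"id": 1, "age": 55, "relation": "head", "spouse_id": 2},
    {"id": 2, "age": 52, "relation": "spouse", "spouse_id": 1},
    {"id": 3, "age": 10, "relation": "child_of_head"},
    {"id": 4, "age": 8, "relation": "child_of_head"},
    {"id": 5, "age": 5, "relation": "child_of_head"},
    {"id": 6, "age": 75, "relation": "parent"},
    {"id": 7, "age": 30, "relation": "sibling", "spouse_id": 8},
    {"id": 8, "age": 28, "relation": "sibling_in_law", "spouse_id": 7},
]
rng = random.Random(0)
core = select_respondents(household, rng)
extended = select_respondents(household, rng, extra_adult_household=True)
print(len(core), len(extended))
```

In this hypothetical household the core rules select five members, and the extra-adult rule adds one further person aged 15 to 49 together with their spouse.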
The Community and Facility Survey collected data from a variety of respondents including: the village leader and his staff and the leader of the village women's group; Ministry of Health clinics and subclinics; private practices of doctors, midwives, nurses, and paramedics; community-based health posts and contraceptive distribution centers; public, private, and religious elementary schools; public, private, and religious junior high schools; public, private, and religious senior high schools. Unlike many other surveys, the sample frame for the survey of facilities was drawn from the list of facilities used by household survey respondents in the area.
Producers and sponsors
Lembaga Demografi (LD)
University of Indonesia
National Institute of Child Health and Human Development
United States Agency for International Development
World Health Organization
National Institute of Child Health and Human Development
Funding for revised IFLS1 data (IFLS1-RR) and documentation
1. HOUSEHOLD SELECTION
The IFLS sampling scheme stratified on provinces, then randomly sampled within provinces. Provinces were selected to maximize representation of the population, capture the cultural and socioeconomic diversity of Indonesia, and be cost effective given the size and terrain of the country. The far eastern provinces of East Nusa Tenggara, East Timor, Maluku and Irian Jaya were excluded due to the high costs of preparing for and conducting fieldwork in these more remote provinces. Aceh, Sumatra's northernmost province, was excluded out of concern for the area's political violence and the potential risk to interviewers. Finally, due to their relatively higher survey costs, we omitted three provinces on each of the major islands of Sumatra (Riau, Jambi, and Bengkulu), Kalimantan (West, Central, East), and Sulawesi (North, Central, Southeast). The resulting sample consists of 13 of Indonesia's 27 provinces: four on Sumatra (North Sumatra, West Sumatra, South Sumatra, and Lampung), all five of the Javanese provinces (DKI Jakarta, West Java, Central Java, DI Yogyakarta, and East Java), and four provinces covering the remaining major island groups (Bali, West Nusa Tenggara, South Kalimantan, and South Sulawesi). The resulting sample represents 83 percent of the Indonesian population (see Figure 1.1 of the Overview and Field Report in External Documents). Table 2.1 of the same document shows the distribution of Indonesia's population across the 27 provinces, highlighting the 13 provinces included in the IFLS sample.
The IFLS randomly selected enumeration areas (EAs) within each of the 13 provinces. The EAs were chosen from a nationally representative sample frame used in the 1993 SUSENAS, a socioeconomic survey of about 60,000 households. The SUSENAS frame, designed by the Indonesian Central Bureau of Statistics (BPS), is based on the 1990 census. The IFLS was based on the SUSENAS sample because the BPS had recently listed and mapped each of the SUSENAS EAs (saving us time and money) and because supplementary EA-level information from the resulting 1993 SUSENAS sample could be matched to the IFLS-1 sample areas. Table 2.1 summarizes the distribution of the approximately 9,000 SUSENAS EAs included in the 13 provinces covered by the IFLS. The SUSENAS EAs each contain some 200 to 300 households, although only a smaller area of about 60 to 70 households was listed by the BPS for purposes of the annual survey. Using the SUSENAS frame, the IFLS randomly selected 321 enumeration areas in the 13 provinces, over-sampling urban EAs and EAs in smaller provinces to facilitate urban-rural and Javanese-non-Javanese comparisons. A straight proportional sample would likely be dominated by Javanese, who comprise more than 50 percent of the population. A total of 7,730 households were sampled to obtain a final sample size goal of 7,000 completed households. Table 2.1 shows the sampling rates that applied to each province and the resulting distribution of EAs in total, and separately by urban and rural status. Within a selected EA, households were randomly selected by field teams based upon the 1993 SUSENAS listings obtained from regional offices of the BPS. A household was defined as a group of people whose members reside in the same dwelling and share food from the same cooking pot (the standard BPS definition). Twenty households were selected from each urban EA, while thirty households were selected from each rural EA.
This strategy minimizes expensive travel between rural EAs and reduces intra-cluster correlation across urban households, which tend to be more similar to one another than do rural households. Table 2.2 (Overview and Field Report) shows the resulting sample of IFLS households by province, separately by completion status.
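As a rough illustration of the within-EA step, the sketch below draws households from one EA listing using the per-EA targets stated above (20 urban, 30 rural). The use of simple random sampling without replacement, the listing contents, and the function name `select_households` are assumptions made for the example; the text does not describe the exact draw procedure used by the field teams.

```python
import random

# Households drawn per EA, as stated in the text.
HOUSEHOLDS_PER_EA = {"urban": 20, "rural": 30}

def select_households(listing, ea_type, rng):
    """Draw a simple random sample of households from one EA listing."""
    target = HOUSEHOLDS_PER_EA[ea_type]
    # Guard for a listing shorter than the target (hypothetical edge case).
    return rng.sample(listing, min(target, len(listing)))

rng = random.Random(1993)
# Hypothetical listing of 65 households, in line with the 60 to 70 listed per EA.
listing = [f"HH{i:03d}" for i in range(1, 66)]
urban_sample = select_households(listing, "urban", rng)
rural_sample = select_households(listing, "rural", rng)
print(len(urban_sample), len(rural_sample))
```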
(b) SELECTION OF RESPONDENTS WITHIN HOUSEHOLDS
For each household selected, a representative member provided household-level demographic and economic information. In addition, several household members were randomly selected and asked to provide detailed individual information.
2. THE COMMUNITY SURVEY SAMPLING PROCEDURE
The goal of the CFS was to collect information about the communities of respondents to the household questionnaire. The information was solicited in two ways. First, the village leader of each community was interviewed about a variety of aspects of village life (the content of this questionnaire is described in the next section). Information from the village leader was supplemented by interviewing the head of the village women's group, who was asked questions regarding the availability of health facilities and schools in the area, as well as more general questions about family health in the community. In addition to the information on community characteristics provided by the two representatives of the village leadership, we visited a sample of schools and health facilities, in which we conducted detailed interviews regarding the institution's activities. A priori we wanted data on the major sources of outpatient health care, public and private, and on elementary, junior secondary, and senior secondary schools. We defined eight strata of facilities/institutions from which we wanted data. Different types of health providers make up five of the strata, while schools account for the other three. The five strata of health care providers are: government health centers and subcenters (puskesmas, puskesmas pembantu); private doctors and clinics (praktek umum/klinik); the private practices of midwives, nurses, and paramedics (perawats, bidans, paramedis, mantri); traditional practitioners (dukun, sinshe, tabib, orang pintar); and community health posts (posyandu, PPKBD). The three strata of schools are elementary, junior secondary, and senior secondary. Private, public, religious, vocational, and general schools are all eligible as long as they provide schooling at one of the three levels.
Our protocol for selecting specific schools and health facilities for detailed interview reflects our desire that selected facilities represent the facilities available to members of the communities from which household survey respondents were drawn. For that reason, we were hesitant to select facilities based solely either on information from the village leader or on proximity to the village center. The option we selected instead was to sample schools and health care providers from lists provided by respondents to the household survey. For each enumeration area lists of facilities in each of the eight strata were constructed by compiling information provided by the household regarding the names and locations of facilities the household respondent either knew about or used. To generate lists of relevant health and family planning facilities, the CFS drew on two pieces of information from the household survey. The IFLS queried wives of household heads as to whether they, a family member, a friend, or someone else they knew had ever used a particular health facility, such as a health center (section PP of Book I, excerpted in Appendix B). When women responded positively, they were asked to provide the name and location of a facility of that type. When women responded negatively, they were asked if they knew of any facilities of that type, and if so, were asked about the name and location of the facility. These responses provided one source of information regarding health facilities of relevance to community members. Information was collected for four types of facilities/providers: government health centers and subcenters; private clinics; private doctors' practices; the practices of nurses, midwives, and paramedics; and traditional practitioners.
The lists of schools were obtained in a slightly different manner. The respondent to the household roster (Section AR, Book I, excerpted in Appendix B) provided the name and location of all schools currently attended by household members under 25 years of age. Consequently, the lists of schools compiled from household information are all schools attended by at least one member of at least one IFLS household. For each enumeration area, eight lists of facilities (one per stratum) were constructed based on the combined household responses from that EA. Tables 3.1 and 3.2 (Overview and Field Report) provide the cumulative distributions of the numbers of facilities (by stratum) identified within EAs. For example, the combined number of health centers identified was fewer than six in 80 percent of the 132 rural EAs in which we interviewed. The combined number of health centers identified was fewer than six in 68 percent of the 189 urban EAs in which we interviewed. Thus, on average, the combined household responses in urban EAs generate a longer list of health centers than do the combined responses in rural EAs. On average, the lists are longer in urban areas than in rural areas for doctors/clinics and all levels of schools as well. However, on average, the lists are longer in rural areas than in urban areas for nurses/midwives and for traditional practitioners.
1. Household Survey:
Of the 7,730 households sampled, a complete interview was obtained for 7,039 households or 91.1 percent of households. A partial interview (i.e., roster-level information was obtained but only a subset of selected household members were interviewed) was obtained for another 185 households (2.4 percent of households), while 506 sampled households (6.5 percent) were not interviewed. The completion rate ranged from a low of 87 percent to a high of 97 percent across the thirteen provinces. The final sample of 7,224 partially or fully completed households consists of 3,436 households in urban areas (90.7 percent partial/full completion rate), and 3,788 households in rural areas (95.9 percent partial/full completion rate).
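The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names and the `pct` helper are illustrative.

```python
# Counts as reported in the text.
sampled = 7730
complete = 7039
partial = 185
not_interviewed = 506
urban_completed = 3436
rural_completed = 3788

def pct(n, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)

# The three outcome categories should account for every sampled household.
categories_sum = complete + partial + not_interviewed

rates = {
    "complete": pct(complete, sampled),
    "partial": pct(partial, sampled),
    "not_interviewed": pct(not_interviewed, sampled),
    "partial_or_complete": pct(complete + partial, sampled),
}
print(categories_sum == sampled, rates)
```

The urban and rural counts (3,436 and 3,788) likewise sum to the 7,224 partially or fully completed households.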
2. Community and Facility Survey:
Not all identified facilities were eligible for interview. Facilities were excluded if they had been interviewed in connection with a previous EA, if they were more than a 45-minute motorcycle trip away, or if they were located in another province. The facilities on each list were ranked by frequency of mention. These ranked lists provided frames for each stratum from which a sample of two to four facilities was drawn. In all strata, the most frequently mentioned facility was always visited. Additional facilities were randomly selected to fill the quota for that stratum. In each EA, the interview target for health centers and sub-centers was four. The target was three for nurse/midwife/paramedic's practices, community health posts, elementary schools, and junior secondary schools. The target was two for senior secondary schools, traditional practitioners, and doctors' practices/clinics. In some enumeration areas the pooled household responses did not generate a sufficient number of facilities to fill the quota.
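The selection rule described above (rank by frequency of mention, always take the most-mentioned facility, then fill the quota at random) can be sketched as follows. The stratum keys, the function name `select_facilities`, and the facility names are hypothetical; the sketch also assumes that ineligible facilities have already been removed, that ties for first place are broken by order of first mention, and that the random draw is uniform, none of which the text specifies.

```python
import random
from collections import Counter

# Interview targets per stratum in each EA, as stated in the text.
QUOTA = {
    "health_center": 4,
    "nurse_midwife_paramedic": 3,
    "community_health_post": 3,
    "elementary_school": 3,
    "junior_secondary_school": 3,
    "senior_secondary_school": 2,
    "traditional_practitioner": 2,
    "doctor_clinic": 2,
}

def select_facilities(mentions, stratum, rng):
    """Pick facilities for one stratum in one EA from pooled household mentions."""
    counts = Counter(mentions)
    if not counts:
        return []
    ranked = [name for name, _ in counts.most_common()]
    top, rest = ranked[0], ranked[1:]
    # The most frequently mentioned facility is always included;
    # the remainder of the quota is filled by a uniform random draw.
    extra = rng.sample(rest, min(QUOTA[stratum] - 1, len(rest)))
    return [top] + extra

rng = random.Random(7)
# Hypothetical pooled mentions of elementary schools in one EA.
mentions = ["School A"] * 5 + ["School B"] * 2 + ["School C", "School D", "School E"]
chosen = select_facilities(mentions, "elementary_school", rng)
# A short list cannot fill the quota, as the text notes for some EAs.
short = select_facilities(["Center X", "Center Y"], "health_center", rng)
print(chosen, short)
```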
(a) HOUSEHOLD SURVEY WEIGHTING
1. Household weights
The household weights are designed to correct for the over-sampling of urban EAs and EAs in smaller provinces discussed above and summarized in Table 2.1 (Overview and Field Report), as well as the differential sampling rates in urban and rural EAs. When the household weights are applied to the IFLS household sample, the resulting weighted distribution will reflect the 1993 distribution of households by urban and rural status within each of the 13 Indonesian provinces covered by the IFLS. The 1993 distribution of households by province and urban/rural status was generated from 1993 projected population counts provided by BPS and from average household sizes computed from the 1993 SUSENAS. BPS projected population counts were divided by average household sizes to get an estimate of the number of households in 1993 in each province/urban-rural stratum.
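A minimal sketch of the household-count estimate follows. The division of projected population by average household size is taken from the text; the final step, which sets the weight to estimated households divided by sampled households in the stratum, is an assumption (a standard expansion weight) rather than the documented IFLS formula. All numbers and stratum labels are hypothetical.

```python
def estimated_households(population, avg_household_size):
    """Estimated number of households in a stratum (from the text)."""
    return population / avg_household_size

# Hypothetical strata: projected population, average household size,
# and number of households in the survey sample.
strata = {
    ("Province A", "urban"): {"population": 2_000_000, "avg_hh_size": 4.0, "sampled_hh": 400},
    ("Province A", "rural"): {"population": 6_000_000, "avg_hh_size": 5.0, "sampled_hh": 600},
}

weights = {}
for key, s in strata.items():
    n_households = estimated_households(s["population"], s["avg_hh_size"])
    # Assumed expansion weight: households represented by each sampled household.
    weights[key] = n_households / s["sampled_hh"]

# Weighted sample totals reproduce the estimated household counts per stratum.
weighted_total = sum(weights[k] * strata[k]["sampled_hh"] for k in strata)
print(weights, weighted_total)
```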
2. Individual weights
The public use file contains three types of individual weights: respondent weights, roster weights, and anthropometry weights.
Respondent weights. The respondent weights are designed to adjust for the within household sampling scheme used to select respondents for detailed interview. From the household roster, the number of household members eligible to be a Book III, IV or V respondent within each household was determined based on the intra-household sampling rules discussed above.
3. Roster weights
The roster weights are designed so that the weighted age and sex distribution of individuals in the household roster data will reflect the 1993 population age and sex distribution by urban and rural strata within the 13 provinces covered by the survey. Five-year age groupings were used, where individuals age 75 and older were treated as one group. The population distribution was based on data from the 1993 SUSENAS. The roster weight is the ratio of the 1993 SUSENAS population proportion to the household roster proportion for the given province/urban-rural/sex/age group stratum into which the individual falls. A roster weight was calculated for all household members listed in the roster (Book I, section AR). If the individual's age was missing, an age group for the individual was imputed. The imputation involved examining the age of the individual's spouse and children; if the individual was a Book III, IV or V respondent, dates and ages provided in those sections were used as part of the imputation.
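The ratio described above can be illustrated with a small post-stratification sketch. The cell definition (province, urban/rural, sex, age group) and the 75+ top group follow the text; the function names, the toy roster, and the population shares are hypothetical, and the sketch assumes both proportions are computed over the same set of cells.

```python
from collections import Counter

def age_group(age):
    """Five-year age group label, with ages 75 and older pooled."""
    if age >= 75:
        return "75+"
    low = (age // 5) * 5
    return f"{low}-{low + 4}"

def roster_weights(cells, population_share):
    """Weight per cell: population proportion divided by roster proportion."""
    n = len(cells)
    counts = Counter(cells)
    return {cell: population_share[cell] / (count / n)
            for cell, count in counts.items()}

# Hypothetical roster: six young girls and four elderly women in one stratum.
young = ("Province A", "urban", "F", "0-4")
old = ("Province A", "urban", "F", "75+")
roster = [("Province A", "urban", "F", age_group(a)) for a in (2, 3, 4, 1, 0, 2, 77, 80, 90, 75)]
population_share = {young: 0.5, old: 0.5}  # hypothetical population proportions

weights = roster_weights(roster, population_share)
# After weighting, each cell's share of the roster matches its population share.
weighted_young = weights[young] * roster.count(young) / len(roster)
print(weights, weighted_young)
```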
4. Anthropometry weights
The anthropometry weights are designed to account for the intra-household sampling scheme used to select the respondents who were weighed and measured. All respondents of Books III, IV or V and any additional children under age 6 living in the household were eligible for anthropometric measurement. Respondents of Books III, IV and V who were measured were given an anthropometry weight equal to their respondent weight (unnormed and uncapped); other children under age 6 were given the household weight (based on the 7,224 household sample). Household members who were measured but not eligible (i.e., they did not fit the selection criteria) were given an anthropometry weight of zero. The initial anthropometry weight was then normalized to sum to the number of those across all households who were eligible to be measured, to account for the fact that not all household members eligible for anthropometric measurement were actually measured. Finally, as with the respondent weight, the anthropometry weight was capped at 3 to control for those with very small probabilities of selection.
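The sequence of steps above (zero weight for ineligible cases, normalization to the number eligible, then capping at 3) can be sketched as follows. The record layout and function name are hypothetical, and the initial weights stand in for the respondent or household weights described in the text.

```python
def anthropometry_weights(measured, n_eligible, cap=3.0):
    """Normalize then cap weights for measured individuals (hypothetical schema)."""
    # Measured but ineligible individuals receive a weight of zero.
    initial = [m["initial_weight"] if m["eligible"] else 0.0 for m in measured]
    total = sum(initial)
    # Scale so the weights sum to the number eligible for measurement.
    scale = n_eligible / total if total else 0.0
    # Cap last, as in the text; after capping the sum may fall below n_eligible.
    return [min(w * scale, cap) for w in initial]

# Hypothetical example: nine people were eligible, three of them were measured,
# and one ineligible household member was also measured.
measured = [
    {"initial_weight": 1.0, "eligible": True},
    {"initial_weight": 4.0, "eligible": True},
    {"initial_weight": 1.0, "eligible": True},
    {"initial_weight": 1.5, "eligible": False},
]
final = anthropometry_weights(measured, n_eligible=9)
print(final)
```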
(b) COMMUNITY AND FACILITY SURVEY WEIGHTING
1. Community Weights
The community weights are designed to correct for the over-sampling of urban EAs and EAs in smaller provinces. When weighted, the CFS communities reflect the number of EAs in the province/urban-rural stratum in which the community lies. The total number of EAs in a given province and urban-rural stratum was computed using 1993 SUSENAS sampling frame data from BPS. The community weight variable is the ratio of the number of actual EAs to the number of sampled EAs.
2. Facility Weights
Ideally a facility should receive a weight that is equal to the inverse of that facility's sampling probability, where the sampling probability is a function of the sampling scheme and the sampling frame.
Dates of Data Collection
Data Collection Mode
1. Data Entry
All data entry was conducted centrally in Jakarta by a staff of data entry personnel. Data entry supervisors were members of LD's permanent staff, while keypunchers were recruited from local universities for the data entry period. Data entry personnel were trained in data entry techniques and in the use of ISSA, a computer-assisted data entry program that allowed immediate checks on data consistency and logic. Once an enumeration area was completed, the questionnaires were packed and shipped to Jakarta with a packing sheet identifying the enclosed questionnaires by number. Questionnaires were then assigned for data entry in batches by enumeration area. Data were entered using ISSA with 100 percent verification (i.e., double entered). Batch editing programs were used in Indonesia to further check the data for completeness and consistency.
The data were transcribed from the recording forms into the PC-based data entry system ISSA (Integrated System for Survey Analysis) by staff at Lembaga Demografi (LD) of the University of Indonesia. The data entry program was developed by Nick Murray of RAND with the assistance of LD staff. All data were 100 percent verified at data entry (i.e., double entered) and the data entry program contained checks on valid ranges and skip patterns. Upon receipt of the IFLS data at RAND, the ISSA ASCII files were converted into SAS® files for use in the data cleaning process and the preparation of a public use file version of the data. Due to the double entry and data entry program checks, data entry errors were essentially nil. The remaining data errors stem from interviewer error and respondent error. Based on problems uncovered so far, there appears to be about a 1-2 percent interviewer/respondent error rate. For files that have, for example, 20,000 records, the 1-2 percent error rate suggests 200-400 records with potential problems. In more complicated sections of the questionnaire, this rate may be a bit higher.
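The idea behind double entry and range checks can be illustrated generically. The sketch below is not ISSA and does not reproduce its behavior; the field names, code ranges, and function names are hypothetical and serve only to show how two independent keyings of the same record can be compared and how out-of-range values can be flagged.

```python
def compare_entries(first_pass, second_pass):
    """Return the fields whose values differ between two keyings of one record."""
    fields = set(first_pass) | set(second_pass)
    return sorted(f for f in fields if first_pass.get(f) != second_pass.get(f))

# Hypothetical valid ranges for two fields.
VALID_RANGES = {"age": (0, 120), "sex": (1, 3)}

def range_violations(record, valid_ranges=VALID_RANGES):
    """Return the fields whose values fall outside their valid range."""
    return sorted(
        f for f, (low, high) in valid_ranges.items()
        if f in record and not low <= record[f] <= high
    )

# Two keyings of the same hypothetical questionnaire record.
first = {"hhid": "0010200", "age": 34, "sex": 1}
second = {"hhid": "0010200", "age": 43, "sex": 1}
mismatches = compare_entries(first, second)
violations = range_violations({"hhid": "0010200", "age": 340, "sex": 1})
print(mismatches, violations)
```

A mismatch such as the transposed age above would be sent back for resolution against the paper questionnaire, which is why errors of this kind are expected to be rare after full verification.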
2. Data Cleaning
Given the size and complexity of the IFLS-HH and IFLS-CF databases and the available project resources, the preparation of the public use files required a data cleaning strategy that would meet basic user needs and make the data available to the research community in a reasonable time frame. Given 100%-verification at data entry, the basic approach, then, was to concentrate on those data cleaning activities which required access to information that was privacy protected. Such cleaning activities could only be done at RAND. Priorities were given to the cleaning of identifier variables--respondent identifiers, anthropometry roster identifiers, household members mentioned elsewhere in the IFLS-HH besides the household roster, the non-coresident sibling and children rosters, and facility identifiers. Within the IFLS-HH, efforts also focused on trying to clean the household roster data so that it could serve as the main source of basic demographic information on household members. Users could then take information from the household roster and use it throughout to provide consistency in characteristics. Additional areas where data checking efforts were made reflect those sections of interest to projects within the P01 grant that included original IFLS funding and those of interest to the report prepared for AID, a sponsor of the survey. Those areas included anthropometric data, income data, outpatient and inpatient utilization, education status and expenses, pregnancy histories and infant feeding, and interhousehold transfers. Efforts also focused on trying to provide as much translated material as possible. Users should be aware that similar information was sometimes collected in more than one section and sometimes from different individuals. 
One data preparation activity that was not able to be done in much detail before public release was the examination of inconsistencies in responses by household members to the same item or event, or by a given respondent to the same event asked about in more than one place. In general, the public use files do not include efforts to reconcile possible differences.
3. Observations on Data Quality
Data problems that were uncovered in the survey are discussed in detail in the IFLS1 User's Guide under "Observations on Data Quality", pages 8-29. Many of the problems discussed have been corrected in the IFLS public use database and are so noted in the User's Guide data quality section. In some cases, however, only suggested corrections are provided via the “fixes” files described above and are noted accordingly. In other cases, decisions on how to handle a particular problem belong in the hands of the research analyst and in such cases, we alert users to the type of problems we have uncovered, but do not provide suggested fixes. The discussion in the User's Guide may help users understand remaining interviewer and respondent errors not detected before public release. The User's Guide is provided as an external resource.
Data Cleaning and Public Use File Creation:
Our experiences with the public release of other survey data such as MFLS-1 and MFLS-2 have led us to develop a policy of cleaning, but not 'overcleaning', public use data. In addition, since most researchers will want to construct their own analysis files, merging and selecting from the data in several ways, the public use files are designed to give users the flexibility they need to put together different types of analysis files. Upon completion of data entry, the keypunched data were shipped to RAND in Santa Monica for data cleaning and public use file preparation. Since all data were 100 percent verified at data entry and the data entry program contained checks on valid ranges and skip patterns, data entry errors were basically non-existent. Consequently, data cleaning efforts initially focused on those activities which required access to information that was privacy protected, such as individual and facility identifiers. In addition, the principal survey materials such as questionnaires and interviewer manuals were translated from Bahasa Indonesia to English. After the initial public release of the IFLS data, subsequent data cleaning efforts sponsored by RAND projects will continue and results of those efforts will be made available to the IFLS user community through the Family Life Surveys Home Page on the World Wide Web (http://www.rand.org/organization/drd/labor/FLS) and the FLS Newsletter.
The IFLS data are placed in the public domain to support research analyses. As a user of the IFLS public use files, you are expected to respect the anonymity of all our respondents. This means that you will make no attempt to identify any individual, household, family, service provider or community other than in terms of the anonymous codes used in the IFLS.
Please do not distribute these data. The data are freely available on our website. It is useful for everyone if we maintain a list of all users. If you plan to work with other people using these data, please ask them to register or register them yourself. If you are a data librarian, please ask users to register on our web page if they obtain a copy of the data from you.
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.