Ethiopia is one of seven countries being supported by the World Bank, through funding from the Bill and Melinda Gates Foundation (BMGF), to strengthen the production of household-level data on agriculture. The Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) has the overarching objective of improving our understanding of agriculture in Sub-Saharan Africa, specifically its role in household welfare and poverty reduction. The implementation will boost the data collection capacity of the national statistical organizations and the quality of household-level agriculture statistics. The data will also provide a basis for analysis of how innovation and efficiency can be fostered in the agriculture sector.
The ESS began as the Ethiopia Rural Socioeconomic Survey (ERSS) in 2011/12. The first wave of data collection in 2011/12 included only rural and small town areas. The word “Rural” was dropped from the survey name in the second wave, when the sample was expanded to include all urban areas. The urban supplement was designed to ensure that the ESS data can provide nationally representative estimates. Accordingly, the number of enumeration areas (EAs) covered by the survey increased from 333 (or 3,776 households) to 433 (or 5,262 households). For the rest of this document, ESS refers to the survey in general; ESS-W1 refers to the first wave, carried out in 2011/12; ESS-W2 refers to the second wave, carried out in 2013/14; and ESS-W3 refers to the third wave, carried out in 2015/16. ESS-W1, ESS-W2, and ESS-W3 together form a panel data set of households from rural and small town areas (i.e., the same households that were interviewed in ESS-W1 were tracked and re-interviewed in ESS-W2 and ESS-W3). ESS-W2 and ESS-W3 together represent a panel of households and individuals for rural and all urban areas.
The Ethiopia Socioeconomic Survey (ESS) is a collaborative project between the Central Statistical Agency (CSA) of Ethiopia and the World Bank Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) team. The objective of the LSMS-ISA is to collect multi-topic, household-level panel data with a special focus on improving agriculture statistics and generating a clearer understanding of the link between agriculture and other sectors of the economy. The project also aims to build capacity, share knowledge across countries, and improve survey methodologies and technology.
ESS is a long-term project to collect panel data. The project responds to the data needs of the country, where a high percentage of households depend on agricultural activities. The ESS collects information on household agricultural activities along with other information on the households, such as human capital, other economic activities, and access to services and resources. The ability to follow the same households over time makes the ESS a new and powerful tool for studying the role of agriculture in household welfare over time: it allows analyses of how households add to their human and physical capital, how education affects earnings, and the effect of government policies and programs on poverty, inter alia. The ESS is the first panel survey to be carried out by the CSA that links a multi-topic household questionnaire with detailed data on agriculture.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
The scope of the ESS includes:
- Household: Household characteristics; household roster; education; health; time use and labour; savings; food last 7 days; food aggregate; non-food expenditure; food security; shocks; housing; assets; non-farm expenditure; other income; assistance; credit.
- Community: Informant roster; basic information; access to basic services; economic activities; agriculture (only for rural EAs); changes; community needs and actions; productive safety nets programme; market prices.
- Post Harvest: Household roster; crop roster; crop harvest by field; unit and size codes; harvest labour; non-permanent crop roster; crop disposition; permanent crop roster; permanent crops; network roster.
- Post Planting: Parcel roster; field roster; crop roster; seeds roster; miscellaneous questions for the holder; network roster; rope and compass measurement; crop cut by field.
- Livestock: ownership; change in stock; breeding; house, water and feed; animal health; milk production; egg production; animal power and dung; household roster; education codes.
ESS uses a nationally representative sample of over 5,000 households living in rural and urban areas. The urban areas include both small and large towns.
Producers and sponsors
Central Statistical Agency of Ethiopia
The World Bank
National Bank of Ethiopia
The sample is a two-stage probability sample. The first stage of sampling entailed selecting primary sampling units, or CSA enumeration areas (EAs). A total of 433 EAs were selected with probability proportional to size, based on the total number of EAs in each region. For the rural sample, 290 EAs were selected from the AgSS EAs. A total of 43 and 100 EAs were selected for small town and urban areas, respectively. In order to ensure sufficient sample size in the most populous regions (Amhara, Oromiya, SNNP, and Tigray) and Addis Ababa, quotas were set for the number of EAs in each region. The sample is not representative for each of the small regions individually (Afar, Benshangul Gumuz, Dire Dawa, Gambella, Harari, and Somalie). However, estimates can be produced for a combination of all smaller regions as one “other region” category. A more detailed description of the sample design is provided in Section 3 of the Basic Information Document provided under the Related Materials tab.
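As a rough illustration of selection with probability proportional to size, the sketch below uses systematic PPS, a common textbook method. The source does not say which selection algorithm the CSA used, so the method, the function name `pps_systematic`, and the size figures are all assumptions made for illustration.

```python
import random

def pps_systematic(sizes, n, rng):
    """Return indices of n selections drawn with probability proportional to size."""
    total = sum(sizes)
    step = total / n                  # sampling interval on the cumulative size scale
    start = rng.uniform(0, step)      # random start within the first interval
    targets = [start + k * step for k in range(n)]

    selected = []
    cumulative = 0
    t = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every target that falls inside this unit's size range selects it.
        while t < n and targets[t] < cumulative:
            selected.append(index)
            t += 1
    return selected

# Hypothetical size measures for six sampling units (not ESS figures).
unit_sizes = [120, 340, 95, 210, 180, 55]
chosen = pps_systematic(unit_sizes, 3, random.Random(2016))
print(chosen)
```

In this scheme a unit whose size exceeds the sampling interval can be selected more than once; real designs usually handle such units separately.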
During wave 3, 1,255 households were re-interviewed, yielding a response rate of 85 percent. Attrition in urban areas is 15 percent, due to consent refusal and inability to trace the whereabouts of sample households.
The ESS-W3 data needs to be weighted to represent the national-level population of rural, small and large town households. A sample weight with post-stratification adjustments was calculated for the households and this weight variable is included in all the datasets. It reflects the adjusted probability of selecting the household into the sample. The inverse of this weight can be considered an expansion factor that sums to the total population of households in the nation. When this weight is used in a household-level file, it sums to the population of households. When this weight is used in an individual-level file, it sums to the population of individuals. If the data user wishes to produce an estimate for the population of individuals in a household-level file, an approximate expansion factor is the sample weight times the household size of each household.
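The use of the weight in a household-level file can be sketched as follows. This takes at face value the statement that the weight sums to the population of households. The field names (`weight`, `hh_size`, `owns_livestock`) and all values are hypothetical and are not the variable names or figures in the ESS datasets.

```python
# Hypothetical household-level records; names and values are illustrative only.
households = [
    {"weight": 1200.0, "hh_size": 4, "owns_livestock": 1},
    {"weight": 800.0, "hh_size": 6, "owns_livestock": 0},
    {"weight": 500.0, "hh_size": 2, "owns_livestock": 1},
]

def estimated_households(rows):
    """Sum of weights: estimated number of households represented."""
    return sum(r["weight"] for r in rows)

def estimated_individuals(rows):
    """Approximate number of individuals: weight times household size."""
    return sum(r["weight"] * r["hh_size"] for r in rows)

def weighted_share(rows, key):
    """Weighted share of households for which a 0/1 indicator equals 1."""
    return sum(r["weight"] * r[key] for r in rows) / estimated_households(rows)

print(estimated_households(households))
print(estimated_individuals(households))
print(weighted_share(households, "owns_livestock"))
```

Point estimates of this kind ignore the survey design when it comes to standard errors; variance estimation would additionally need the strata and EA identifiers.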
The ESS3 sample weights were calculated in two stages. In the first stage, weights were separately calculated or adjusted for the three different sampling frames (rural, small town, and large town). For the rural and small town sample, the wave 1 weights were adjusted to account for relisting, non-response, and attrition of households in the sample frame between the two waves (wave 1 and wave 3). In each of the waves, the rural and small town EAs were re-listed, which reflects EA-specific population growth patterns. The post-stratification adjustment accounts for this change.
Similarly, for the mid- and large-town sample, the wave 2 weights were adjusted to account for relisting, non-response, and attrition of households in the sample frame between the two waves (wave 2 and wave 3). In each of the waves, the mid- and large-town EAs were re-listed, which reflects EA-specific population growth patterns. The post-stratification adjustment accounts for this change.
Dates of Data Collection
Post-planting agriculture and Livestock questionnaires
Crop cut questionnaire
Household, Community, and Post-harvest agriculture questionnaire
All modules in large town EAs
Data Collection Mode
Mixed data collection mode
The interviews were carried out using pen-and-paper personal interviewing (PAPI) as well as computer-assisted personal interviewing (CAPI). A concurrent data entry arrangement was implemented for PAPI. In this arrangement, the enumerators did not wait until all the interviews were completed. Rather, once the enumerators completed approximately 3-4 questionnaires, supervisors collected these questionnaires from the enumerators and brought them to the branch offices for data entry. This process took place as enumerators continued administering interviews with other households. The questionnaires were keyed at the branch offices as soon as they were completed, using the CSPro data entry application software. The data from the completed questionnaires were then checked for any interview or data entry errors using a Stata program. Data entry errors were flagged for the data entry clerks, and the interview errors were sent back to the field for correction and as feedback to the ongoing interviews. Several rounds of this process were undertaken until the final data files were produced. Additional cleaning was carried out, as needed, by checking the hard copies. In ESS3, CAPI (with a Survey Solutions platform) was used to collect the community data in large town areas.
The ESS collects confidential information on respondents. The confidential variables pertain to (i) names of the respondents to the household and community questionnaires, (ii) village and constituency names, (iii) descriptions of household dwelling and agricultural field locations, (iv) phone numbers of household members and their reference contacts, (v) GPS-based dwelling and agricultural field locations, (vi) names of the children of the head/spouse living elsewhere, (vii) names of the deceased household members, (viii) names of individuals listed in the network roster, and (ix) names of field staff. To maintain confidentiality, this information is not included in the ESS public use data.
To partially satisfy user interest in geo-referenced location, while preserving the confidentiality of sample households and communities, modified EA-level coordinates are provided as part of the household geovariable table. Modified coordinates are generated by applying a random offset within a specified range to the average EA value (following the MeasureDHS approach). For households that have moved between waves 1 and 3, and are more than 5 km from their baseline location, the offset is with respect to the new household location. More specifically, the coordinate modification strategy relies on random offset of EA center-point coordinates (or average of household GPS locations by EA in ESS) within a specified range determined by the urban and rural classification. For small towns and urban areas, an offset range of 0-2 km is used. In rural areas, where communities are more dispersed and risk of disclosure may be higher, a range of 0-5 km offset is used. Additionally, an offset range of 0-10 km is applied to 1% of EAs, effectively increasing the known range for all points to 10 km while introducing only a small amount of noise. Offset points are constrained at the zone level, so that they still fall within the correct zone for spatial joins, or point-in-polygon overlays. The result is a set of coordinates, representative at the EA level, that fall within known limits of accuracy. Users should take into account the offset range when considering different types of spatial analysis or queries with the data. Analysis of the spatial relationships between locations in close proximity would not be reliable. However, spatial queries using medium or low resolution datasets should be minimally affected by the offsets.
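The offset rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used to produce the ESS geovariables: the uniform draw of distance and bearing, the small-distance approximation, the retry loop for the zone constraint, and every name (`displace`, `max_offset_km`, `in_zone`) are hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def max_offset_km(is_rural, rng):
    """Pick the offset limit: 10 km for about 1% of EAs, else 5 km rural / 2 km urban."""
    if rng.random() < 0.01:
        return 10.0
    return 5.0 if is_rural else 2.0

def displace(lat, lon, is_rural, rng, in_zone=lambda la, lo: True, max_tries=100):
    """Offset a point by a random distance and bearing, retrying until in_zone accepts it."""
    limit = max_offset_km(is_rural, rng)
    for _ in range(max_tries):
        distance = rng.uniform(0.0, limit)
        bearing = rng.uniform(0.0, 2.0 * math.pi)
        # Small-distance approximation: convert km to degrees of latitude/longitude.
        dlat = (distance / EARTH_RADIUS_KM) * math.cos(bearing)
        dlon = (distance / EARTH_RADIUS_KM) * math.sin(bearing) / math.cos(math.radians(lat))
        new_lat = lat + math.degrees(dlat)
        new_lon = lon + math.degrees(dlon)
        if in_zone(new_lat, new_lon):   # hypothetical point-in-polygon check
            return new_lat, new_lon
    raise RuntimeError("no offset point found inside the zone")

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Illustrative EA center point; coordinates are made up for the example.
moved = displace(9.0, 38.7, is_rural=True, rng=random.Random(3))
print(distance_km(9.0, 38.7, *moved))
```

Because any released point may lie up to 10 km from the true EA center, distance-based analysis at finer scales than that should be treated with caution.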
Before being granted access to the dataset, all users have to formally agree:
1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.
Public use files, accessible to all
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Central Statistical Agency of Ethiopia. Ethiopia Socioeconomic Survey, Wave 3 (ESS3) 2015-2016. Public Use Dataset. Ref: ETH_2015_ESS_v02_M. Downloaded from [URL] on [Date].
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.