Theses Doctoral

Data-Driven Decision Support for Low Electricity Access Settings

Fobi Nsutezo, Sally Simone

Universal, affordable and reliable electricity remains a key pillar of achieving the Sustainable Development Goals. Low-income countries find bridging gaps in electricity access particularly challenging. Judicious financial investment is critical in a low-income setting, where many compelling areas compete for limited resources. A data-driven approach that leverages prior data from electricity service providers can guide decision making.

This dissertation presents approaches that leverage such data to provide utilities and national bodies with useful insights. It makes five contributions, in the form of key results about electricity consumption patterns, novel methodologies for electricity demand prediction, and relevant metrics for estimating the cost of a grid connection.

First, through in-depth analysis of electricity data from thousands of households, this thesis sheds light on electricity consumption patterns in Rwanda and Kenya. This work revealed that utilities are increasingly connecting low-consuming households whose consumption peaks sooner and plateaus lower than that of their peers who were connected earlier. While previous research has focused on addressing electricity supply-side constraints, this work is the first of its kind to show that electricity consumption for the newly electrified is very low, thereby making capital cost recovery of a grid connection even harder to achieve. This mismatch between supply and demand emphasizes the need for utilities to better quantify expected demand upon connection.

Secondly, this thesis makes methodological contributions that support electricity demand prediction for households that are not yet grid-connected. Specifically, Convolutional Neural Network (CNN) models were designed to take pre-grid-access daytime satellite image patches as inputs and output electricity consumption levels. Results from this work show that the proposed methodologies perform better than utility-based estimates of anticipated demand. This methodology shows that rapid, large-scale evaluation of latent demand can be effectively performed using daytime satellite imagery, thereby giving guidance on which sites or regions are more suitable for grid versus off-grid technologies. Outputs from the models have been utilized by energy planners in Kenya.

The third contribution of this dissertation is the development of key metrics to estimate the cost of grid access. Complementary to the evaluation of electricity demand, this thesis also develops an electricity grid network optimization model connecting 9.2 million structures in Kenya. Given transformer placement and estimates of low- and medium-voltage line lengths, an approximation of the per-household wire requirement is obtained. The work shows that the traditional rural/urban classification based on population density may not be sufficient, and is often misleading, for estimating the cost of grid access. A new categorization based on the proposed per-household wire requirement metrics provides more relevant estimates of the total cost.

Fourthly, this dissertation demonstrates methods to repurpose electricity data in order to provide insights into new domains such as household wealth. This work illustrates how overall household expenditure can be estimated from electricity usage data and how electricity usage can be estimated from daytime satellite imagery. Together, these methodological contributions give stakeholders a pathway to estimate overall household expenditure from daytime satellite imagery. The work shows the value of electricity data in answering questions in new domains without the deployment of additional surveys or hardware.

The final research contribution discussed in this thesis focuses on modifying existing machine learning models to support analysis in settings where labels are scarce and of poor quality. This concept is illustrated with a building segmentation task in which building footprints are misaligned or omitted. The proposed end-to-end learning pipeline demonstrates how building characteristics can be learned in data-constrained regions despite incomplete and noisy labels. In addition, this work is used to provide explanatory features for the CNNs used for prediction in the earlier parts of the work.

While the focus of the research was on Kenya and Rwanda, this work is relevant to multiple domains, such as water and internet access, and can be extended to other countries seeking evidence-based approaches to inform sustainable development.


More About This Work

Academic Units: Mechanical Engineering
Thesis Advisors: Modi, Vijay
Degree: Ph.D., Columbia University
Published Here: August 3, 2022