WIDER Working Paper 2019/111: The effects of the Maputo ring road on the quantity and quality of nearby housing

Using convolutional neural networks applied to satellite images covering a 25 km x 12 km rectangle on the northern outskirts of Greater Maputo, we detect and classify buildings from 2010 and 2018 in order to compare the development in quantity and quality of buildings from before and after construction of a major section of the ring road. In addition, we analyse how the effects vary by distance to the road and conclude that the area has seen large overall growth in both quantity and quality of housing, but it is not possible to distinguish growth close to the road from general urban growth. Finally, the paper contributes methodologically to a growing strand of literature focused on combining machine-learning image recognition and the availability of high-resolution satellite images. We examine the extent to which it is possible to exploit these methods to analyse changes over time and thus provide an alternative (or complement) to traditional impact analyses.


Introduction
In 2011, the Government of Mozambique launched the Maputo Ring Road construction project, which aimed to connect the main entry roads into the city and allow traffic to and from South Africa to bypass the city centre. Work began in 2012 and most of the road was finished by 2015, although one intersection (Tchumene) was still under construction at the time of writing but is expected to be completed early in 2020. The total cost of the project is around US$315 million, of which US$300 million has been financed through a loan from the Chinese state-owned bank Exim-Bank. The project has included improvements to 22 km of existing road and the construction of 52 km of new road (Government of Mozambique 2012). The latter will be the main focus of this paper.
The economic benefits of the ring road, mostly in the form of shorter travel times, have already spread throughout Maputo and most of southern Mozambique. Locals enjoy a shorter commute, business owners have lower transportation costs, and tourists can more easily reach the sandy beaches of Mozambique. Further, following completion in 2017 of the Maputo Bay Bridge (the largest suspension bridge in Africa), which links Maputo to its southern hinterland and to Durban, the road infrastructure in Mozambique's capital has had a much-needed upgrade, relieving pressure on several transport bottlenecks. This study will not attempt to assess the impacts of these infrastructure projects on the overall economy of Mozambique or Maputo. Its main purpose is to evaluate the changes to housing conditions in the areas surrounding the newly built sections of the ring road.
When observing aerial photography or high-resolution satellite images of the northern parts of Greater Maputo, it is clear that widespread urbanization has occurred during the past decade. Some of this may have happened irrespective of the construction of the ring road; however, academics and policy makers have an interest in knowing more about how the ring road project has contributed to this development. For instance: Where specifically has urban expansion taken place? Are the developments concentrated around the newly built stretch of road, or are they determined by distance to the city centre? Can we learn anything about the quality of houses in this 'new' part of the city, and how does proximity to the road affect this, if at all? Knowing the answers to these questions and more can shed light on some of the effects of improving urban infrastructure which are often not measured by interview-based impact assessments.
Using convolutional neural networks (CNNs) trained on around 200,000 tagged satellite images of structures and non-structures in urban areas in Mozambique, we detect and classify buildings in high-resolution satellite images for the years 2010 and 2018 in order to compare the distribution of houses before and after construction of the road. In addition, we estimate changes to the share of buildings with high-quality roofs and the share of buildings under construction, and analyse the effect on these estimates arising from their distance to the road. Finally, the paper contributes to a growing strand of literature focused on combining recent advances in machine-learning image recognition and the availability of high-resolution satellite images. In particular, we test the extent to which it is possible to exploit these methods to analyse changes over time and thus provide an alternative (or complement) to traditional impact analyses.

Context and literature
The availability of high-resolution images of the earth is increasing rapidly, leading to interesting new applications. It also provides opportunities to improve public policies, especially in data-scarce environments, including many developing countries. While the increase in the number of satellite images, such as the free images provided by the European Space Agency's Sentinel satellites, is partly due to government investment, it may be due more to the expansion of commercial satellites. The imaging company Planet Labs currently maintains 140 satellites, enough to pass over every place on earth once a day. Maxar, formerly DigitalGlobe, is building a constellation that will be able to revisit spots 15 times a day, and BlackSky Global promises to revisit most major cities up to 70 times a day (Beam 2019).
The usability of satellite images for public policies has improved markedly through the application of CNNs for image recognition. For instance, CNNs can detect specific features in images which can support analysis and policies. Examples include identification of washed-away buildings after a tsunami (Fujita et al. 2017); detection of building footprints, including in many African countries (Tiecke et al. 2017; Liu et al. 2019); tracking of icebergs at sea (Malmgren-Hansen et al. 2019); 24/7 surveillance of the quality of infrastructure such as bridges (Avci et al. 2017); and estimation of poverty at the neighbourhood level in urban areas in Mozambique (Fisker et al. 2019a). This study builds upon partial results from Fisker et al. (2019a), namely the identification of structures of various types and qualities from high-resolution satellite images.
Lall et al. (2017) set the context for describing the potential of infrastructure investments in African urban centres by listing a number of stylized facts about large African cities that constrain them from reaching their growth potential. First, they tend to be crowded with people rather than physical capital. Central business districts are generally characterized by a high share of paved roads, which is comparable to same-size cities in the Western world. However, paved roads are very scarce outside of the city centres. This is not optimal since it generates congestion problems in the centre as well as long commutes for people living in the periphery. Second, cities are made up of separate neighbourhoods that are not well connected. This means that the number of available jobs within a reasonable travel distance from a given location is low. This is especially problematic for the formal sector, as informal jobs are often carried out in or close to people's homes.
Third, and perhaps as a result of the lack of physical infrastructure, population density is also high close to the city centres, but quickly tapers off in the outskirts. In Maputo, 30 per cent of land within 5 km of the central business district is left undeveloped (Lall et al. 2017).
In one of the few studies that attempt to measure the citywide effects of the construction of new transport routes, Bernard et al. (2016) show that, in Kampala, a northern bypass (similar in size to the Maputo ring road) has reduced the cost of living for high-skilled labour by 19 per cent and by 3 per cent for low-skilled labour, a welfare effect of eight times the construction costs. Their model also captures the direct impact on transport costs as well as the more long-term agglomeration effects that arise from the location choices of firms and households.

Hypotheses
The findings of the previous studies described above lead to the formulation of the following hypotheses about the potential effects of the construction of the Maputo ring road:
i) As lower costs of living and shorter travel time should attract households and firms, we expect the structure density to increase over time, and to a greater extent in the vicinity of the road than in control areas.
ii) Completion of the ring road in Maputo should lead to higher welfare citywide (not tested here), and especially in areas around the road. It is therefore expected that housing quality, as measured by the share of high-quality roofs within a certain geographical area, increases after completion of the road, and to a greater extent in the vicinity of the road than in control areas.

Data generation
The area of interest studied is illustrated in Figure 1, where the left side illustrates the various stages of the ring road project with an ellipsoid highlighting the areas of interest in this study. The exact study area is defined in the right-hand panel, a rectangle covering about 25 km x 12 km spanning the northern outskirts of the city of Maputo and the large areas of the Matola and Marracuene suburbs. In order to track the city's development, two CNNs were trained to detect 1) the presence of structures, and 2) the type or quality of the detected structures. The CNNs were first developed for the purpose of generating detailed urban poverty maps for urban areas in Mozambique (Fisker et al. 2019a). With minor adjustments, the same algorithms are applied to images before and after the construction of the ring road, and thereby provide evidence of the impact of the road on city development.

Developing CNNs for structure detection
The CNN's ability to recognize structures in images is based on training data. Hence, the CNN is trained by showing it a large number of examples of the object it has to recognize. This paper relies on two separate CNNs: 1) the CNN-detector, which detects structures, and 2) the CNN-classifier, which subsequently categorizes the detected structures from the CNN-detector into different types. The CNN algorithms were trained using the Python programming language with the TensorFlow and Keras libraries. For both CNNs, the training images are 50 x 50 pixels, which is big enough to cover a typical house and its surroundings. The images for training the CNNs were downloaded from the Google Maps API, under Google's 'fair use' agreement, making them free for non-commercial use. The training data is publicly available through Mendeley Data (Fisker et al. 2019b).
The dataset for training the CNN-detector consisted of a total of 107,507 tagged structures from the five largest cities in Mozambique (Maputo, Nampula, Tete, Quelimane, and Beira). Further, tags indicating where there were no structures were also required to train the CNN-detector. For this, 100,000 coordinates, at least 20 metres from the nearest house tag in the dataset, were extracted for the city of Tete. Tete was chosen as it was completely tagged, unlike the other cities, which were only partly tagged (Fisker et al. 2019a).
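The extraction of non-structure coordinates can be illustrated with a minimal sketch. It assumes that the tag positions have already been projected to metres and that candidate points are drawn uniformly within a bounding box; the function name, the rejection-sampling approach, and the brute-force distance check are illustrative assumptions, not the procedure actually used for Tete.

```python
import numpy as np

def sample_non_structure_points(tags_xy, bounds, n, min_dist=20.0, seed=0):
    """Rejection-sample n points at least min_dist metres from every tag.

    tags_xy: (k, 2) array of tagged structure positions in projected metres.
    bounds:  (xmin, ymin, xmax, ymax) of the fully tagged area, in metres.
    All names and defaults here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    tags_xy = np.asarray(tags_xy, dtype=float)
    xmin, ymin, xmax, ymax = bounds
    accepted = np.empty((0, 2))
    while len(accepted) < n:
        # Draw a batch of candidate points inside the bounding box.
        cand = rng.uniform([xmin, ymin], [xmax, ymax], size=(n, 2))
        # Distance from every candidate to every tag (brute force).
        dists = np.linalg.norm(cand[:, None, :] - tags_xy[None, :, :], axis=2)
        # Keep only candidates far enough from their nearest tag.
        accepted = np.vstack([accepted, cand[dists.min(axis=1) >= min_dist]])
    return accepted[:n]
```

For a dataset of the size described here, a spatial index would be preferable to the brute-force distance matrix, which is used above only to keep the sketch short.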
For the second-stage CNN-classifier, the detected structures were classified into 'apartment', 'nonresidential', 'under construction', 'painted roof', 'grey roof', and 'palm roof' categories. After reviewing the data, the 'apartment' and 'palm roof' categories were found to be unsuitable for CNN training due to under-representation of samples or the tagging quality. For the training of the CNN-classifier, a balanced sample was extracted with approximately equal samples of the 'nonresidential', 'under construction', 'painted roof', and 'grey roof' categories. In this study, we only employ the CNN-detector and the CNN-classifier 'painted roof' and 'under construction' categories.
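One simple way to obtain a balanced sample is to downsample every category to the size of the smallest one. The sketch below assumes this approach; the function name and the use of plain Python lists are illustrative, and the paper does not state how its balanced sample was drawn.

```python
import random
from collections import Counter, defaultdict

def balanced_sample(items, labels, seed=0):
    """Downsample each category to the size of the smallest category.

    Hypothetical helper: returns a shuffled list of (item, label) pairs.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)
    # The smallest category determines how many samples each class keeps.
    n_per_class = min(len(group) for group in by_label.values())
    sample = []
    for label, group in by_label.items():
        sample.extend((item, label) for item in rng.sample(group, n_per_class))
    rng.shuffle(sample)
    return sample

# Toy example with the four categories used for training the classifier.
labels = (["nonresidential"] * 5 + ["under construction"] * 8
          + ["painted roof"] * 12 + ["grey roof"] * 20)
items = list(range(len(labels)))
counts = Counter(label for _, label in balanced_sample(items, labels))
```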
An out-of-sample evaluation of the performance of both CNNs found that the CNN-detector has an accuracy of 97 per cent (evaluated on a 10 per cent hold-out excluded from the training sample), whereas the CNN-classifier has, on average, 67 per cent accuracy (painted roof: 73 per cent; under construction: 69 per cent) (Fisker et al. 2019a).

CNNs applied to the ring road
To analyse the effects of the construction of the Maputo ring road on local housing, we applied the observed relationship between the images and the structure categories obtained in the previous step to new images of the study area depicted in Figure 1.
In order to ensure 1) the highest possible resolution, 2) no gaps or overlaps between images, and 3) minimal manual work, a so-called fishnet grid of 220 cells of around 1,155 m x 1,155 m was created using GIS software. An example of one of these locations for 2010 and 2018 is shown in Figure 2. The somewhat odd cell size corresponds exactly to the area of the images that are downloadable from Google Earth when the resolution is set to the maximum 4,000 x 4,000 pixels at the highest meaningful zoom level. The centre points of the fishnet grid are then exported to KML format and opened in Google Earth, making it possible to download each image with only a few clicks. The images used in this study had to be manually downloaded because, at the time of writing, the Google Maps API only allowed automated extraction of images by centre coordinates for the most recent images in its archive. This would not have allowed us to compare images going back in time.
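The construction of the fishnet grid and the KML export can be sketched as follows. The grid layout of 22 columns by 10 rows (which gives the 220 cells mentioned above), the south-west corner coordinates, and the flat-earth conversion from metres to degrees are all assumptions made for illustration; the study itself used GIS software for this step.

```python
import math

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def fishnet_centres(sw_lat, sw_lon, n_cols, n_rows, cell_m=1155.0):
    """Return (lat, lon) centre points of a regular grid of square cells."""
    centres = []
    for r in range(n_rows):
        for c in range(n_cols):
            north_m = (r + 0.5) * cell_m   # offset of cell centre from the SW corner
            east_m = (c + 0.5) * cell_m
            lat = sw_lat + north_m / M_PER_DEG_LAT
            # A degree of longitude shrinks with the cosine of the latitude.
            lon = sw_lon + east_m / (M_PER_DEG_LAT * math.cos(math.radians(lat)))
            centres.append((lat, lon))
    return centres

def to_kml(centres):
    """Serialize centre points as KML placemarks (KML order is lon,lat,alt)."""
    placemarks = "".join(
        f"<Placemark><name>cell_{i}</name>"
        f"<Point><coordinates>{lon:.6f},{lat:.6f},0</coordinates></Point></Placemark>"
        for i, (lat, lon) in enumerate(centres)
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            f"{placemarks}</Document></kml>")

# Hypothetical south-west corner of the study area.
centres = fishnet_centres(sw_lat=-25.95, sw_lon=32.40, n_cols=22, n_rows=10)
kml_text = to_kml(centres)
```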
From these images, centre coordinates of sub-patches detected as structures were stored and converted to latitude and longitude by approximation from the centre of the cells. We also calculated mean statistics for the whole cells, such as construction density, given as the number of sub-patches detected as containing a structure, divided by the total number of sub-patches in the cell.
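A minimal sketch of the two computations described above is given here. It assumes north-up images of 4,000 x 4,000 pixels covering 1,155 m, non-overlapping sub-patches of 50 x 50 pixels, and a flat-earth approximation around the cell centre; none of these details is confirmed by the text, and the function names are hypothetical.

```python
import math
import numpy as np

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def patch_centre_latlon(cell_lat, cell_lon, row, col,
                        image_px=4000, patch_px=50, cell_m=1155.0):
    """Approximate latitude/longitude of a sub-patch centre within a cell image."""
    m_per_px = cell_m / image_px
    # Pixel offsets of the patch centre relative to the image centre.
    dx_px = (col + 0.5) * patch_px - image_px / 2
    dy_px = image_px / 2 - (row + 0.5) * patch_px  # image rows run north to south
    lat = cell_lat + dy_px * m_per_px / M_PER_DEG_LAT
    lon = cell_lon + dx_px * m_per_px / (
        M_PER_DEG_LAT * math.cos(math.radians(cell_lat)))
    return lat, lon

def construction_density(detected_mask):
    """Share of sub-patches in a cell that were detected as containing a structure."""
    return float(np.asarray(detected_mask, dtype=bool).mean())

# Toy cell: 80 x 80 sub-patches, with the northern quarter detected as structures.
mask = np.zeros((80, 80), dtype=bool)
mask[:20, :] = True
density = construction_density(mask)
```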
Based on the extracted coordinates of each detection, we then created a dataset consisting of 25,252 'pixels' of around 115 m x 115 m, as depicted in Figures 4, 5, and 6. Each pixel contains information on construction density (i.e. share of pixels covered by structures), share of high-quality roofs, share of buildings under construction, distance to city centre, distance to ring road, and whether the pixel is located north or south of the ring road. Importantly, the 2010 baseline data was taken around a year before the Government of Mozambique decided to implement the construction project. We therefore consider it likely that the population of the general area were not expecting a road to be built. At the other end of the timescale, in 2018, at least three years had passed since the end of the road construction process (with the exception of the Tchumene intersection), meaning that most of the households and firms that wanted to move to the area had had enough time to realize their intentions. So, while the process goes on and new areas are probably still being populated, most of the ring road's immediate impact on housing can be assumed to have materialized in 2018.
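The aggregation of individual detections into pixels could look roughly like the sketch below. It assumes detections in projected metres, treats the 'painted roof' category as the high-quality roof indicator, and uses 64 sub-patches per pixel as the denominator for density; all three are assumptions for illustration, as is the function name.

```python
import math
from collections import defaultdict

def aggregate_to_pixels(detections, pixel_m=115.0, patches_per_pixel=64):
    """Aggregate detections into square pixels and compute per-pixel shares.

    detections: iterable of (x_m, y_m, category) tuples in projected metres.
    Pixels without any detection are simply absent from the result.
    """
    counts = defaultdict(lambda: {"n": 0, "painted roof": 0, "under construction": 0})
    for x_m, y_m, category in detections:
        key = (math.floor(x_m / pixel_m), math.floor(y_m / pixel_m))
        counts[key]["n"] += 1
        if category in ("painted roof", "under construction"):
            counts[key][category] += 1
    return {
        key: {
            "density": c["n"] / patches_per_pixel,
            "share_high_quality": c["painted roof"] / c["n"],
            "share_under_construction": c["under construction"] / c["n"],
        }
        for key, c in counts.items()
    }

# Toy detections: three fall in pixel (0, 0) and one in pixel (1, 0).
toy = [(10, 10, "painted roof"), (20, 30, "grey roof"),
       (100, 100, "painted roof"), (200, 10, "under construction")]
pixels = aggregate_to_pixels(toy)
```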

Analysis
Our analysis, based on the data generated by the CNNs, consists of two parts: a descriptive analysis and a regression analysis.

Descriptive analysis
The overall trends in our key variables (density of structures, share of high-quality roofs, and share of buildings under construction) are shown in Figure 3. It shows that construction density across the whole study area increased from 13 per cent in 2010 to 50 per cent in 2018, the share of high-quality roofs increased from 9 per cent to 24 per cent, while the share of buildings under construction decreased from 33 per cent to 26 per cent between 2010 and 2018. Since the last two are shares of the first, this means that the density of high-quality roofs increased far more dramatically than the 15 percentage point increase in the share suggests, and that, while the share of structures under construction decreased, the absolute number of structures under construction increased from 2010 to 2018.
Source (Figure 3): authors' own calculations.
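The arithmetic behind this statement can be made explicit by multiplying each share by the construction density of the same year. The short sketch below uses only the figures reported above; the variable names are illustrative, and the products are implied quantities rather than numbers reported in the paper.

```python
# Reported values for the whole study area.
density = {2010: 0.13, 2018: 0.50}
share_high_quality = {2010: 0.09, 2018: 0.24}
share_under_construction = {2010: 0.33, 2018: 0.26}

# Implied densities: share among detected structures times structure density.
implied = {
    year: {
        "high_quality": density[year] * share_high_quality[year],
        "under_construction": density[year] * share_under_construction[year],
    }
    for year in (2010, 2018)
}

for year, values in implied.items():
    print(year, {name: round(value, 4) for name, value in values.items()})
```

Under this reading, the implied density of high-quality roofs rises from about 1.2 per cent to 12 per cent, and the implied density of structures under construction rises from about 4.3 per cent to 13 per cent, even though its share among structures falls.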
Figures 4, 5, and 6 illustrate the spatial distribution of the changes in structures over time. Figure 6 shows that, in 2010, most construction activity took place in the southern area, closer to the city centre, whereas the construction activity in 2018 was more concentrated in the north and central areas of the study area. The most dramatic development between 2010 and 2018 in terms of housing quantity and quality took place in the western half of the study area (Figures 4 and 5). This area officially forms part of the suburb of Matola, which is in a different administrative division to Maputo City, namely the Province of Maputo and Municipality of Matola. However, areas located in the south-east quadrant (and thereby part of Maputo City) were already partly built-up in 2010. The north-eastern quadrant of the maps, belonging to the Province of Maputo and the district of Marracuene, also does not show as strong growth in construction density as the western half.
We have no evidence that the administrative boundaries are key to this, nor do we have any evidence that they are not important for this observed growth.
In Figures 4, 5, and 6, some areas are blank for both years studied. According to Andersen et al. (2015), these are military zones (bottom), state forests (just north of the road in the eastern half), or areas designated for urban agricultural production (far-east side, north-south central stretch).
It is noteworthy for all three city growth indicators (density of structures, share of high-quality roofs, and share of structures under construction) that the road is located in areas of great change, potentially indicating that the road was important for the observed growth. However, while the road is located in areas that have grown, there is also limited indication that the growth is centred around the road itself. None of Figures 4, 5, or 6 clearly show increased structure density, construction activity, or high-quality roofs in the immediate vicinity of the road. As explained earlier, the road is so influential for the city that it is not possible to identify areas that have not been impacted by it. We can only assess whether the road has had a larger impact on areas closer to it than on those further away. To examine this in more detail we turn to our regression analysis.

Regression analysis
To assess the road's impact, we focus on the change in structure density, quality of roofs, and construction activity between 2010 and 2018, as in regression (1):

$$\Delta Y_i = \beta_0 + \beta_1 \, \mathit{dist\_road}_i + \beta_2 \, \mathit{dist\_cc}_i + \varepsilon_i \qquad (1)$$

where $\Delta Y_i$ is the change in structure density, density of high-quality roofs, or density of buildings under construction between 2010 and 2018 in pixel $i$. dist_cc and dist_road measure the geodesic distance to the city centre and ring road in km, respectively. $\beta_0$ is a constant and $\varepsilon_i$ is a pixel-specific random error term. The key variable of interest is dist_road and its coefficient $\beta_1$, which captures how the change over time varies with distance to the road. The distance to the city centre is included as a control variable as we expect this to have a direct influence on the outcome variables beyond and irrespective of the presence of the ring road.
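A regression of this form can be estimated by ordinary least squares. The sketch below assumes OLS as the estimator and uses synthetic data generated with known coefficients purely for illustration; it does not reproduce the paper's data or results, and the function and variable names are hypothetical.

```python
import numpy as np

def fit_regression_1(delta_y, dist_road, dist_cc=None):
    """OLS fit of delta_y on a constant, dist_road and, optionally, dist_cc.

    Returns the coefficient vector ordered as (constant, dist_road[, dist_cc]).
    """
    dist_road = np.asarray(dist_road, dtype=float)
    columns = [np.ones_like(dist_road), dist_road]
    if dist_cc is not None:
        columns.append(np.asarray(dist_cc, dtype=float))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, np.asarray(delta_y, dtype=float), rcond=None)
    return beta

# Synthetic pixels (not the paper's data): growth declines with both distances.
rng = np.random.default_rng(0)
n = 5000
dist_road = rng.uniform(0, 6, n)    # km to the ring road
dist_cc = rng.uniform(10, 30, n)    # km to the city centre
delta_y = 0.5 - 0.03 * dist_road - 0.01 * dist_cc + rng.normal(0, 0.05, n)

beta_with_control = fit_regression_1(delta_y, dist_road, dist_cc)
beta_without_control = fit_regression_1(delta_y, dist_road)
```

A negative coefficient on dist_road in such a fit corresponds to higher growth closer to the road, since the regressor measures distance rather than proximity.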
Implementing regression (1), with and without controlling for distance to the city centre, shows that the closer one gets to the road, the higher the growth in structure density and high-quality roofs, but the fewer the structures under construction (the interpretation is the reverse of the coefficient signs, as our key variable measures distance to the road and not proximity to it). This could indicate a positive impact on the density of structures and high-quality buildings close to the road, where development has already matured and there are therefore fewer structures under construction.
However, the impact of the road may also vary by the distance to the road. It might be preferable to be close to the road for easy access, but not be right next to it, as this could cause discomfort in terms of e.g. noise and heavy traffic. Figure 7 explores this by estimating the non-linear relationship between changes in our outcome variables of interest and the distance to the road. For growth in density of structures and share of high-quality structures, the relationship is largely linear. For the share under construction, there is a non-linear relationship with a peak in construction growth about 5 km from the road, which could be consistent with the current boundary of the city (Figure 6).
As highlighted in our descriptive analysis, in the study area, the city is growing rapidly from south to north and the ring road is right in the middle. Though growth in the variables of interest is higher in locations closer to the road than further away, the analysis presented thus far is not able to argue that the road, and not just the general growth of the city from south to north, is driving this. One way to argue that growth is being driven by the road, not just by general city expansion, is by analysing the impacts of the road north and south of the road separately. Figures 8 and 9 replicate Figure 7 for pixels located south and north of the ring road, respectively.
The figures reveal that the impact of the road is very different in the north compared to the south of the road. The overall results seen in Figure 7 and Table 1 are clearly driven by different subsamples.
The higher growth in structure density and share under construction close to the road is driven by areas north of the road, while the opposite pattern is observed south of the ring road. The observed higher growth in high-quality housing closer to the road, on the other hand, is driven by areas south of the road, while little growth is observed for areas north of the road.
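The logic of comparing the pooled relationship with the north and south subsamples can be sketched with simple binned means by distance to the road. Binned means are used here as a stand-in; the estimator behind Figures 7, 8, and 9 is not specified in the text, and the toy numbers below are invented to show how opposite patterns on the two sides can be hidden in a pooled estimate.

```python
import numpy as np

def binned_means(dist_road, delta_y, bin_km=1.0):
    """Mean change in the outcome within distance-to-road bins of width bin_km."""
    dist_road = np.asarray(dist_road, dtype=float)
    delta_y = np.asarray(delta_y, dtype=float)
    idx = np.floor(dist_road / bin_km).astype(int)
    # Keys are the lower edge of each bin in km.
    return {float(b * bin_km): float(delta_y[idx == b].mean()) for b in np.unique(idx)}

def binned_means_by_side(dist_road, delta_y, is_north, bin_km=1.0):
    """Binned means computed separately for pixels north and south of the road."""
    dist_road = np.asarray(dist_road, dtype=float)
    delta_y = np.asarray(delta_y, dtype=float)
    is_north = np.asarray(is_north, dtype=bool)
    return {
        "north": binned_means(dist_road[is_north], delta_y[is_north], bin_km),
        "south": binned_means(dist_road[~is_north], delta_y[~is_north], bin_km),
    }

# Toy pixels (invented): growth falls with distance in the north, rises in the south.
dist = np.array([0.5, 1.5, 0.5, 1.5])
dy = np.array([1.0, 0.0, 0.0, 1.0])
north = np.array([True, True, False, False])

pooled = binned_means(dist, dy)
by_side = binned_means_by_side(dist, dy, north)
```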
All in all, this points to the road being located in an area that is growing, but there is no indication of any increased growth in the density of structures, quality of structures, or structures under construction in the vicinity of the road that is in addition to the general growth of this part of the city.
We chose to compare images from the years 2010 and 2018 for two reasons. First, these two end years provide a good period for analysis since construction of the ring road commenced shortly after the baseline year of 2010, and 2018 has enough of a margin from the construction for the expected effects to have had time to materialize. Second, good-quality, cloud-free images were available for the two years, taken at around the same time of year, i.e. September for 2010 and May/June for 2018. While not exactly the same months, both periods are in the dry season.
In order to elaborate on the analysis, three extra sets of images were downloaded for the intermediate years of 2012, 2014, and 2016. However, these were not all of the same quality, and variation in the month and the time of day when the images were taken seems to have influenced the number of structures detected by the CNN structure detector in particular. This is apparent in Figure 10, which shows an impossible trend over time.
Source (Figure 10): authors' own calculations.
By overlaying and visually comparing detections of images, it appears that some years over-detect and other years under-detect depending on (at least) three factors. First, if an image is of lower quality or has a low contrast, it becomes more difficult for the CNN structure detector to disentangle structures and non-structures, leading to under-detection or over-detection. Second, the time of year of the image influences the saturation of vegetation as well as the colour of the soil. In some instances, trees or bushes can be mistaken for buildings. An attempted quick fix of adding around 1,000 examples of trees to the non-house category when training the algorithm was not sufficient to improve the algorithm's ability to distinguish between these. Third, the time of the day when an image is captured may also influence the number of detections the CNN structure detector generates. If an image is taken in the middle of the day when the sun is directly overhead, shadows are generally not visible, but, if the image is taken in the morning or the afternoon, shadows may be even more noticeable than the differences between e.g. a building and a road. When shadows are prevalent, the general outcome is more detections.
When looking at construction density, the overall trend is broken by the years 2012 and 2014 (Figure 10). For 2012, there is a stark over-detection, which is also visible from manual inspection of the detections (mainly arising from shadows), whereas, for 2014, many structures are not detected due to the poor quality of the images.
The trend in the share of high-quality roofs across the five years is close to linear, which is largely in line with expectations. This indicates that, although the number of structures detected is sensitive to variation in the quality of the images, the classification of detections into different categories seems to work better.
Although our CNN algorithm has been trained using data from five different cities, and thus five different hours and (possibly) months, it is still not at a stage of development where comparisons over time are trustworthy in all cases. A possible solution to the problem requires, as a minimum, much more training data with much more variation in resolution, contrast, vegetation, and shadows. As noted in section 2.1, most of the training data comes from the city of Tete where all the buildings in the city were tagged, thus allowing for construction of a good dataset of non-structures as well. However, the non-structure category, in particular, clearly needs to be expanded with tags from more cities and many more image types (month, hour, quality) for the comparison of the number of buildings over time to be exact. This will be a topic for future research.

Conclusion
This study documented changes to the quantity and quality of housing in an area of around 12 km x 25 km in the northern outskirts of Greater Maputo, from before to after the completion of a major ring road construction project. Two CNNs were adopted to detect and classify buildings in satellite images for the years 2010 and 2018, accessed via Google Earth. The results show that between 2010 and 2018, overall construction density increased from 13 per cent to 50 per cent; the share of high-quality roofs among detections increased from 9 per cent to 24 per cent; and the share of buildings under construction declined from 33 per cent to 26 per cent. Though areas closer to the ring road had higher growth in structures, a higher share of high-quality structures, and a declining share of structures under construction, we cannot convincingly argue that this was driven by proximity to the road rather than just the road being located in a part of town that was growing rapidly. We come to this conclusion as the impact of the road is very different north and south of the road, while there is a consistent pattern of the city growing from the south toward the north.