Using tax data to inform economic policy in South Africa

Countries need data and evidence to create, amend, and evaluate policy. Indeed, one of the key mandates of policy makers in South Africa is to make evidence-based policy decisions which improve the policy landscape of the country.

The South African Revenue Service (SARS), the National Treasury (NT), and UNU-WIDER collaborated under the Regional growth and development in Southern Africa project to use administrative tax data from SARS for economic policy analysis, placing South Africa at the vanguard of big data research for development.

The data has opened avenues for research in areas and topics that were not previously possible in South Africa. First, the data permits analyses at the level of the firm, rather than the sector, allowing for more detailed and nuanced insights. Second, the data provides both snapshots and longer-term views of the entire formal workforce, allowing for more credible analyses of the labour market. Third, the level of detail available in the tax data allows existing policy issues to be examined from new angles and perspectives.

Over the past seven years, the tax data has been used to examine the employment tax incentive, the minimum wage, returns to the R&D incentive, wealth inequality, and how small businesses respond to corporate tax increases.

How did it happen?

The tax data project started small in 2014; the data lab consisted of three donated computers and a handful of data scientists and researchers working tirelessly to prepare the data and make it available. Over the following three years, the local and international research communities took note of the data access in Pretoria and saw the potentially large benefits for policy research, and demand for access to the data increased.

In 2018, as part of the Southern African – Toward Inclusive Economic Development programme, full-time staff were hired to further develop the tax data project. This led to a large increase in data usage, new datasets being made available for research, and a steady improvement in the quality of data.

The National Treasury Secure Data Facility (NT-SDF) was opened in early 2019, doubling the number of people who can access the tax data simultaneously and greatly increasing the processing power. This also gave rise to several technical training courses and capacity-building for young graduates and, in the process, improved the quality of the data and its documentation.

What did we learn?

First, the experience on the tax data project has shown that it is possible to make sensitive tax data available for research without compromising the anonymity of firms or individuals. Managing the confidentiality of the data remains a constant and important part of running a secure data centre. This bodes well for the possibility of using other administrative datasets in the future and initiating similar tax projects in other countries.

Second, a project of this nature requires patience and perseverance. Not everyone sees the value in tax data for research; legislation around data is slow to catch up with reality, and government departments sometimes need support to develop and implement such initiatives.

Third, building institutional capacity is crucial for the sustainability of a secure data centre. Technical capacity is needed to analyse the tax data, and so are people with the ability to manage and run such world-class facilities.

What next?

In the short term it is expected that there will be further improvements to the quality of the data available. Automation in tax form completion will likely translate into fewer data errors and thus more reliable data.

Researchers can expect to see new and additional guides available — a typical shortcoming of administrative data is a lack of this type of documentation. Statistics South Africa (Stats SA) recently agreed to host the metadata and it is expected that this collaboration will grow.

The project has also revealed a clear demand for training and capacity-building on the use of tax data and research methods. The next phase aims to develop a training course based on the tax data, further increasing the usefulness and accessibility of the data.

In the medium to long term, there are plans to include new sources of administrative data that can further deepen the scope of research possibilities. The collaboration between SARS, NT, and UNU-WIDER has shown that this can be done, and that the benefits of doing so in terms of informing evidence-based policy-making are large.

It has become clear that the NT-SDF and the tax data are both invaluable resources with a bright future ahead. The lab has served as a crucial meeting point for researchers and policy makers to discuss ideas, interrogate assumptions, and in some cases initiate their own research or policy evaluations.

For many reasons, exploitation of tax data has become the global best practice in some avenues of research, and South Africa can be commended for mounting a serious effort to employ data for the purposes of policy analysis.

It will also become possible to investigate the economy-wide impact of the coronavirus pandemic on firms and jobs as the data becomes available, as well as the policy levers which could best be utilised — a clear and pressing concern for the future of research on the tax administrative data.