
The Potential Value of a National Geospatial Database
National Geospatial Database: Proceedings of the NGD Seminar, 5 June 1996
David Rhind (Director General, Ordnance Survey)
The National Geospatial Database (NGD) is envisaged as the totality of
many individual data sets, collected and held separately by many different
organisations. To be part of the NGD, each of these data sets must:
- be geospatial, i.e. contain data that is geospatially referenced
in some way (either directly or indirectly);
- be 'known about', in that a potential user must not only know of the
existence of the data set but also have access to some information about
its provenance and quality;
- be accessible, requiring the terms, methods and conditions of access
to the data to be publicly available;
- be capable of linkage with other data since implicit in the concept
of the NGD is the idea of data combination.
The NGD has never been viewed as a single database, nor can it be owned
by any one body. Rather, it is seen as an initiative that will help to facilitate
the wider, multiple use of existing data sets.
At present, the NGD is about geospatial data in the UK, with an initial
focus on data gathered using public funds. It has been estimated that the
taxpayer spends no less than £100 million a year on geospatial data,
but the exact figures are difficult to determine and could be much higher.
The collection of this data is uncoordinated, with data often being captured
for a single specific purpose and often not made more generally available
thereafter.
Clear distinctions can be made between the complementary initiatives
of the NGD and the National Land Information Service (NLIS) and ScotLIS.
The latter two are services within the framework of the NGD; they
aim to provide end-users with a tangible end-product via an application
that links and manipulates geospatial data taken from several independent
sources. The NGD, on the other hand, is seen as data-centric: it
provides the data baseline that enables services such as the NLIS and
ScotLIS to identify and combine underlying data sets with greater ease.
The potential benefits of the NGD are seen as:
- Reducing the cost of data collection through a reduction in different
organisations collecting the same data independently. Cataloguing what
exists is an important first step. The Spatial Information Enquiry Service
(SINES) run by Ordnance Survey on behalf of the Inter-departmental Group
on Geographic Information (IGGI) goes some way towards achieving this,
although SINES is far from perfect.
- Improving the quality and consistency of the data. The mere act of
combining two data sets will reveal inconsistencies, errors and omissions,
which will force correction and improvement. For example, when Ordnance
Survey attempted to combine its own data on building locations with the
Royal Mail's Postcode Address File (PAF®), inconsistencies were found
which subsequently led to improvements in both data sets. Similarly, CORINE,
a European Commission project, showed that the major apparent changes in
some environmental parameters occur along national boundaries! This is
indicative of the different data collection methods and standards employed
by the individual countries, rather than a genuine natural phenomenon.
- Deriving added value through combining data sets. The more data sets
that can be combined, the greater the potential for added value and the
greater the potential number of end-user applications. This is already
happening through projects such as the NLIS, where data from HM Land Registry,
Ordnance Survey, Valuation Office and Bristol City Council are being combined
with the intention of providing direct access to the underlying databases.
The British Geological Survey is providing information on hazards through
combining its own data sets on lithology and physical characteristics
of 'soils' with data from the Royal Mail, Ordnance Survey and insurance
companies. The list of potential combinations is endless.
- Improving access to data.
- Improving decision making through the proper combination of data sets
to reveal a more accurate picture. Clear benefits are evident
from real experience: the speech by the US Secretary of the Interior,
Bruce Babbitt, on modelling floods below Glen Canyon Dam is a good example
(Annex 5).
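The quality benefit listed above, where the act of combining two data sets
exposes inconsistencies and omissions, can be illustrated with a minimal
Python sketch. It assumes, purely for illustration, that both data sets carry
a postcode and a building number that can serve as a shared key. The field
names, the normalise and cross_check helpers and the sample records are all
hypothetical; they do not reflect the actual structure of Ordnance Survey
data or of the PAF.

```python
def normalise(postcode: str) -> str:
    """Uppercase a postcode and remove whitespace so that keys compare reliably."""
    return "".join(postcode.split()).upper()


def cross_check(buildings, addresses):
    """Compare two record sets keyed on (postcode, building number).

    Returns the keys found only in the first set, the keys found only in
    the second set, and the keys found in both.
    """
    building_keys = {(normalise(r["postcode"]), r["number"]) for r in buildings}
    address_keys = {(normalise(r["postcode"]), r["number"]) for r in addresses}
    return {
        "only_in_buildings": sorted(building_keys - address_keys),
        "only_in_addresses": sorted(address_keys - building_keys),
        "matched": sorted(building_keys & address_keys),
    }


# Hypothetical sample records, not real data.
buildings = [
    {"postcode": "ab1 2cd", "number": 1},
    {"postcode": "AB1 2CD", "number": 2},
    {"postcode": "AB1 2CD", "number": 4},
]
addresses = [
    {"postcode": "AB12CD", "number": 1},
    {"postcode": "AB1 2CD", "number": 2},
    {"postcode": "AB1 2CD", "number": 3},
]

report = cross_check(buildings, addresses)
for category, keys in report.items():
    print(category, keys)
```

Records that appear in only one of the two sets are candidates for
investigation in both sources, not proof of an error in either. In practice
the hard part is agreeing a key that both data sets genuinely share.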
There are many challenges to bringing about the NGD. It is not obvious
that data owners will collaborate. There are many benefits in their doing
so, but also responsibilities that they may not wish to have and costs they
may be unable or unwilling to bear. Standards and best practices will need
to be defined and adopted, which may be difficult for existing data sets
with all the implications of re-engineering. There are many technical problems
inherent in the data collected by different bodies, such as differences in
levels of detail and classification.
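The classification problem mentioned above can be made concrete with a small
Python sketch. It assumes that two bodies record land cover under different
coding schemes and that a lookup table (a "crosswalk") to a shared scheme has
been agreed. The codes, the crosswalk tables and the harmonise helper are
hypothetical and are not drawn from any real classification standard.

```python
# Hypothetical crosswalks from two source classifications to a shared scheme.
CROSSWALK_A = {"W1": "woodland", "W2": "woodland", "G1": "grassland"}
CROSSWALK_B = {"forest": "woodland", "pasture": "grassland", "urban": "built-up"}


def harmonise(records, crosswalk):
    """Map each record's source class to the shared scheme.

    Records whose class has no mapping are returned separately rather than
    guessed at, so that gaps in the crosswalk stay visible.
    """
    mapped, unmapped = [], []
    for record in records:
        common = crosswalk.get(record["cls"])
        if common is None:
            unmapped.append(record)
        else:
            mapped.append({**record, "common_cls": common})
    return mapped, unmapped


# Hypothetical sample records from the two bodies.
records_a = [{"id": 1, "cls": "W1"}, {"id": 2, "cls": "X9"}]
records_b = [{"id": 10, "cls": "forest"}]

mapped_a, unmapped_a = harmonise(records_a, CROSSWALK_A)
mapped_b, unmapped_b = harmonise(records_b, CROSSWALK_B)
print(mapped_a + mapped_b)
print(unmapped_a + unmapped_b)
```

Keeping unmapped records apart, instead of forcing them into the nearest
class, is a deliberate choice here: it avoids manufacturing apparent
agreement between data sets that were never compatible. Note that W1 and W2
both map to "woodland", so the finer distinction held by the first body is
lost in the shared scheme, which is the level-of-detail problem in miniature.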
Lastly, it is important to remain focused. There are dangers in trying
to be too ambitious for available resources. It can be argued that the US
National Spatial Data Infrastructure is suffering from such overambition.
In conclusion, linking geospatial data offers great potential. Sometimes
it will be easy to do this and the technical problems may be small. But
more often than not, it will be all too easy to create nonsense by using
data sets that are not capable of being sensibly combined. Reliability can
only be achieved through the creation of a single framework, adherence to
common standards and best practice, good documentation of the data and a
well-educated set of users.
© Copyright 1998. NGDF