Finding Intersections with Python

From Bellman
Jump to navigation Jump to search

Where roads meet

We need to be able to build a table of all the places roads meet.

  • This is not a page about the general case of geometric intersections. Road intersections.
  • And we don't want to find the places that roads CROSS (over- or under-crossings). Just where they join.

Both of these points are salient because they mean dealing with dirty input data. Always the case.

  • Roads should always meet at a vertex
  • Roads that meet should always have a common vertex, not just be "close".
  • Roads that do not meet (crossings) should NOT meet these conditions.
    • Any gaps should indicate no connection.
    • Any place roads touch but not at a vertex should indicate a crossing.

But none of this is ever true! Alas, we always get data that looks good on a map but is not routable.

To this end, I want a tool that:

  • Generates a point layer of intersections
  • Generates a point layer of error cases

The tools

I have a version of this program already written in C# using ArcObjects, and it's a BIG MESS. Worst of all, every time I do a build in a new release of ArcGIS, it breaks!

I want something easy to maintain, easy to update, easy to run, and reasonably fast.

To accomplish this I am reading Python Geospatial Development by Erik Westra. Packt Pubs

I've been working a lot with PostGIS lately and it would be ideal for this project. HOWEVER I need to be able to give it to some ESRI users and I don't want to teach them how to use PostGIS as part of that project. This means I will be using Shapely.

I want to be able to use File Geodatabases, so I will be using a custom built GDAL with Python bindings on Linux and the OSGeo4W version on Windows.


Here is a page where I talk about Building GDAL on Linux. I guess I should also do a write up on Using OSGeo4W but I have not done that yet.


Here is the home of Shapely, where it is described as "PostGIS-ish operations outside a database context for Pythoneers and Pythonistas." Sounds like what I need.

The steps

Starting off with the simplest version, ignoring all the data cleaning.

  1. Read centerline data with OGR (part of GDAL).
  2. Find all the crossings with Shapely.
  3. Write points with appropriate attributes (FENAME1, FENAME2)

With the tools at our disposal, this should be so easy! It turns out that it is!.

I was able to write a script in Python and develop it on a Linux system, and the first time I ran it on Windows, it ran successfully to completion with NO CHANGES. Awesome

The next part now will be getting it to run in ESRI Model Builder.

Explore data from command line using ogrinfo.

ogrinfo -so cntrline.shp cntrline
Gives lots of info I won't include here

Read shapes into Shapely geometries

Time to code up some python. I want to be able to read ESRI File Geodatabases too. And write them.

class InputFeatures:
    # Input fields
    input_fieldname = "FULLNAME"

    # Stubby little streets less than this are ignored.
    MinLineLength = 3.0
    FeatureList = [] # tuple containing (hashed name, ogr feature, shapely geom)
    SpatialReference = None
    BadNameCount = 0
    EmptyNameCount = 0
    BadGeometryCount = 0
    DebugFeatureCount = 0 # Set to some value to load a few features only -- good for testing
    _d_ignored_names = {} # ALLEY, UNKNOWN...

    def __init__(self, input_fieldname, ignored_path, st_types_path):
        self.input_fieldname = input_fieldname
        icount = self._load_ignored_names(ignored_path)
        if icount == 0:
            print "Could not load ignored names file '%s'" % ignored_path

        self.condenser = streetnames.Condense()
        ccount = self.condenser.LoadStreetTypes(st_types_path)    
        if ccount == 0:
            print "Could not load street types file '%s'" % st_types_path

    def LoadFeatures(self, fcname):
        """ Load everything from the input feature layer into memory.
            Well.. not EVERYTHING. Ignores meaningless features. """
        feature_count = 0
        (ogr_ds, ogr_layer) = self._open_layer(fcname)
        if not ogr_layer: 
            print "Can't open input layer '%s'" % fcname
            feature_count = self._loadFeatures(ogr_layer)  
            self.SpatialReference = ogr_layer.GetSpatialRef()
        return feature_count

    def _open_layer(self, fcname):
        (workspacename, layername) = fc_to_ogrname(fcname)
        ds = None
        layer = None
            ds = ogr.Open(workspacename);
            layer = None
            if ds:
                layer = ds.GetLayerByName(layername)
        except Exception:
            print "Could not open input feature class '%s" % fcname
        return (ds, layer)

    def _loadFeatures(self, layer):
        max_feature_count = layer.GetFeatureCount()
        debug_count = self.DebugFeatureCount
        if (debug_count > 0) and (debug_count < max_feature_count):
            max_feature_count = debug_count

        for i in range(max_feature_count):
            if progress and not i % 1000: print i, "..",
            feature = layer.GetFeature(i)
            if not feature:
                self.BadGeometryCount += 1
            name = feature.GetField(self.input_fieldname)
            if not name:
                self.EmptyNameCount += 1

            # Ignore all streets with names like ALLEY and UNKNOWN
            if self._d_ignored_names.has_key(name):
                self._d_ignored_names[name] += 1 # count each one to help debugging
                self.BadNameCount += 1;

                # Convert OGR geometry to Shapely geometry
                geometry = feature.GetGeometryRef()
                shgeom = shapely.wkt.loads(geometry.ExportToWkt())
                print "%d : %s (EXCEPTION)" % (i, name)
                self.BadGeometryCount += 1
            msg = self._featureIsValid(shgeom)
            if msg :
                print "%d : %s : %s" % (i, name, msg)
                self.BadGeometryCount += 1
            # Geometry is good and name is good, so add it to our list
           # Make a condensed name from the FULLNAME field.

            # I wonder what would happen if I actually used hash() here
            hashed = self.condenser.HashStreetName(name)
            self.FeatureList.append( (hashed, feature, shgeom) )

        return len(self.FeatureList)
    def _featureIsValid(self, shgeom):
        """ Return True if this feature passes validity tests. """
        if not shgeom.is_valid:
            return "Geometry is not valid."
        if shgeom.is_empty:
            return "Geometry is empty."
        mp = list(shgeom.boundary.geoms)
        if len(mp) == 0:
            return "Geometry is empty!"
        if shgeom.length < self.MinLineLength:
            return "Short line ignored."
        return None

def isshapefilename(fcname):
    if fcname[-4:].lower() == '.shp':
        return True
    return False

def fc_to_ogrname(fcname):
    """ Convert an ESRI feature class name into an OGR workspace and layername """
        if isshapefilename(fcname):
            # We're working with a shapefile.
            # Don't test for existence because we might be making a new shapefile.
            workspacename = fcname
            layername = os.path.basename(fcname)[:-4]
            return (workspacename, layername)
    except Exception:
        directory = os.path.dirname(fcname)
        if directory[-4:].lower() == '.gdb':
            # We're working with a file geodatabase
            if not os.path.exists(directory):
                print "File geodatabase '%s' does not exist" % directory
                return (None, None)
            workspacename = directory
            layername = os.path.basename(fcname)
            return (workspacename, layername)
    except Exception:

    print "Something is wrong with feature class '%s'" % fcname
    return (None,None)

2. Use Shapely to find the intersections.

First I find where lines cross.

Then I look at the endpoints of each line segment.

class GenerateCrossPoints:
    # Units depend on projection of data source, usually in feet.
    BufferDistance = 3.0

    # Trim pre-directionals from output street names when set true
    trim_predirectionals = True 

    # Input 
    input_fieldname = "FULLNAME"
    featureList = None

    # Dictionaries for fast look up tables.
    d_street_types = {} # ST AV BLVD
    d_street_translations = {}  # stuff like first => 1

    # OUTPUT
    # Has to be a list because of dupe names + modifier (1st and MAIN 1, 1st and MAIN 2,...)
    IntersectionList = [] # Tuple (geom, (hash1, hash2), fename1, fename2, modifer, rating)
    ErrorList = []        # Tuple (geom, fename1, fename2, errdesc)
    # used to differentiate between intersections with same name
    _d_modifier = {}      # [hashed names] => counter count instances of each name

    # rating, lower is better
    # 0 = vertex at line crossing
    # 1 = endpoint touches a vertex
    # 2 = crossing
    # 3 = endpoint touches a line
    # 4 = line is near vertex
    # 5 = endpoint is near a line

    # Regular expressions, compiled ahead of need

    # Used for trimming predirectionals
    re_predir  = re.compile(r'^(N|S|W|W|NE|NW|SE|SW)\s(.*)$', re.IGNORECASE)


    def __init__(self, input_fieldname):
        self.input_fieldname = input_fieldname
    def FindIntersections(self, featureList):
        self.featureList = featureList
        self._findLineIntersections()      # where lines cross
        self._findEndpointIntersections()  # where endpoints meet at vertices

    def _getVertices(self, geom, points):
        """ Given a shapely geometry (line or multi)
            Return a list of the points in that object """
        if geom.type == "MultiLineString":
            for line in geom.geoms:
                self._getVertices(line, points)
            for p in geom.coords:
    def _adjustName(self, name):
        """ Adjust the fename according to trim flag. """
        if self.trim_predirectionals:
            mo =
            if mo:
        return name
    def _proximate(self, keytuple, pt):
        """ Return True if the named pt is close to an existing intersection with the same name. """

        for (ex_pt, ex_keytuple, ex_fename1, ex_fename2, modifier, rating) in self.IntersectionList:
            if keytuple == ex_keytuple:
                # Has same name, is it nearby?
                buffered = pt.buffer(self.BufferDistance)
                if ex_pt.within(buffered):
                    if verbose:
                        print "Nearby + same name = ignored '%s x %s'" % keytuple
                    return True
        return False
    def _add_intersection(self, pt, keytuple, fename1, fename2, rating):
        """ Given geometry for a point, condensed name pair, full name pair, and rating,
        check to see if this is already in our list of intersections
        if not, then add it. """
        # Don't add if it already exists
        # 1ST x MAIN
        if (keytuple in self._d_modifier):
            if self._proximate(keytuple, pt): return;
        swapped = (keytuple[1], keytuple[0])  # MAIN x 1ST
        if (swapped in self._d_modifier):
            if self._proximate(swapped, pt): return;
            #print "Swapping names to remove duplication"
            t = fename1
            fename1 = fename2
            fename2 = t
            keytuple = swapped
        # Is there already an intersection with this name?
        if self._d_modifier.has_key(keytuple):
            modifier = self._d_modifier[keytuple]
            self._d_modifier[keytuple] += 1
            self._d_modifier[keytuple] = 1
            modifier = None
        # This should be the only place we need to trim the 2 street names, just before storing them.
        self.IntersectionList.append((pt, keytuple, self._adjustName(fename1), self._adjustName(fename2), modifier, rating))
    def _findLineIntersections(self):
        print "Looking for line intersections"
        i = test_count = hit_count = 0
        for (condensedname, feature, shgeom) in self.featureList:
            for (othercondensedname, otherfeature, othergeom) in self.featureList:
                if progress and not i % 1000:
                    print i,
                    if not i % 10000: print
                i += 1
                test_count += 1
                if condensedname == othercondensedname:
                    # no self intersections allowed
                if shgeom.intersects(othergeom) :
                    pt = shgeom.intersection(othergeom)
                    hit_count += 1
                    # two lines cross
                    rating = 2
                    # is there a vertex here?
                    # or is there a vertex NEAR here?
                    vertices = []
                    self._getVertices(othergeom, vertices)
                    buffered = shgeom.buffer(self.BufferDistance)
                    for vertex in vertices:
                        if shgeom.equals(vertex):
                            rating = 0 # vertex at line crossing
                        if vertex.within(buffered):
                            rating = 4 # line is near vertex
                    fename1 = feature.GetField(self.input_fieldname) 
                    fename2 = otherfeature.GetField(self.input_fieldname)
                    keytuple = (condensedname, othercondensedname)
                    self._add_intersection(pt, keytuple, fename1, fename2, rating)
                    if verbose: print "Hit! '%s x %s'" % keytuple
        if progress: print
        return hit_count
    def _findEndpointIntersections(self):
        """ This should be run after generating line intersections.
            It checks the endpoints of every line. If there is an intersection
            there already, fine. Otherwise see if there is a line nearby and
            if so, add it. And endpoint is near but not touching a line. """
        print "Looking at endpoints.."  
        i = 0
        for (condensedname, feature, geom) in self.featureList:
            i += 1
            #if progress: print "Checking endpoints for %s" % (condensedname)
            fename1 = feature.GetField(self.input_fieldname)
            endpoints = list(geom.boundary.geoms)
            self._process_linestring_endpoint(condensedname, fename1, endpoints[0])
            self._process_linestring_endpoint(condensedname, fename1, endpoints[1])
    def _process_linestring_endpoint(self, condensedname, fename1, endpoint) :
        # If there is already an intersection here,
        for (i_point, keytuple, ex_fename1, ex_fename2, modifier, rating) in self.IntersectionList:
            if endpoint.equals(i_point):
                # We're good already. Not an error
    #            if verbose: print "Endpoint already matched '%s'" % hashkey
        # else
        #   is there a line nearby?
        buffered = endpoint.buffer(self.BufferDistance)
        test_count = hit_count = 0
        for (othercondensedname, otherfeature, othergeom) in self.featureList:
            test_count += 1
            if condensedname == othercondensedname:
                # no self intersection allowed
            # Are we touching
            elif endpoint.intersects(othergeom) :
                wkt = shapely.wkt.dumps(endpoint)
                hit_count += 1
                fename2 = otherfeature.GetField(self.input_fieldname)
                rating = 3 # endpoint actually touches a line
                keytuple = (condensedname, othercondensedname)
                self._add_intersection(endpoint, keytuple, fename1, fename2, rating)            
                if verbose: print "Endpoint touches line '%s x %s'" % keytuple
            # Are we not quite touching
            elif buffered.intersects(othergeom) :
                wkt = shapely.wkt.dumps(endpoint)
                hit_count += 1
                fename2 = otherfeature.GetField(self.input_fieldname)
                rating = 5 # endpoint is near a line
                keytuple = (condensedname, othercondensedname)
                self._add_intersection(endpoint, keytuple, fename1, fename2, rating)            
                if verbose: print "Endpoint near line '%s x %s'" % keytuple
        return hit_count

# end of class definition

3. Write intersection points to a feature class

This code can write to either a shapefile or a geodatabase. It looks at the string you pass in and decides. I use the ESRI concept of a "feature class" to make it easier to run this script from ESRI Model Builder.

  • Shapefiles end in .shp for example "foo/bar/intersections.shp"
  • Geodatabases have a layername at the end and a GDB in them, for example "goo/gump/features.gdb/intersections".
def createOutputLayer(fcname, sref):
    (ogr_ds, ogr_layer) = create_layer(fcname, sref)
        # Add the attribute fields

    fieldDefn = ogr.FieldDefn(fename1_fieldname, ogr.OFTString);
    fieldDefn = ogr.FieldDefn(fename2_fieldname, ogr.OFTString);
    fieldDefn = ogr.FieldDefn(modifier_fieldname, ogr.OFTString);
    fieldDefn = ogr.FieldDefn(rating_fieldname, ogr.OFTInteger);
    return (ogr_ds, ogr_layer)

def create_layer(fcname, sref):
    """ Create a layer for output """
    (ds_name, layer_name) = fc_to_ogrname(fcname)
    if isshapefilename(fcname):
        driver = ogr.GetDriverByName("ESRI Shapefile")
        if os.path.exists(ds_name):
        ogr_ds = driver.CreateDataSource(ds_name)
        driver = ogr.GetDriverByName("FileGDB")
        if os.path.exists(ds_name):
            ogr_ds = driver.Open(ds_name)
            rval = ogr_ds.DeleteLayer(layer_name)
            ogr_ds = driver.CreateDataSource(ds_name)
    if ogr_ds is None:
        print "Create layer '%s' failed" % ds_name
        return (None, None)
    ogr_layer = ogr_ds.CreateLayer(layer_name, sref, ogr.wkbPoint)
    fieldDefn = ogr.FieldDefn(ID_FIELDNAME, ogr.OFTInteger)

    return (ogr_ds, ogr_layer)

def saveIntersections(ogr_layer, intersectionList):
    global fename1_fieldname, fename2_fieldname, modifier_fieldname
    featureDefn = ogr_layer.GetLayerDefn()

    not_a_point = 0
    fid = 0
    for (p, keytuple, fename1, fename2, modifier, rating) in intersectionList:

        if p.type == "Point":
            wkt = shapely.wkt.dumps(p)
            feature = ogr.Feature(featureDefn)

            fid += 1
            feature.SetField("id", fid);
            feature.SetField(fename1_fieldname, fename1)
            feature.SetField(fename2_fieldname, fename2)
            feature.SetField(rating_fieldname, rating)
            feature.SetField(modifier_fieldname, modifier) 
            # Now create a second feature with the streetnames reversed.
            fid += 1
            feature.SetField("id", fid);
            feature.SetField(fename1_fieldname, fename2)
            feature.SetField(fename2_fieldname, fename1)
            feature.SetField(modifier_fieldname, modifier)
            not_a_point += 1
    return fid