MapFactor OSM Data Processing
  • Hello OSM mappers,
    I would like to start this thread as a presentation of the issues MapFactor is dealing with during OpenStreetMap data processing. It would be a place where I could "complain" (and not you this time ;-)...

    Roundabouts
    MapFactor expects roundabouts to be closed circles. If one is not, navigation is confused at the very least. In the worst case it even caused crashes. The application (Android 1.0.57+) is now more robust, but the nonsensical navigation remains. It looks like there are some bad examples in every country.

    Street names
    At the same place in Poland, the OSM data (2013.11) even crashed text placement. It was caused by extra spaces at the beginning of words. This is fixed now (Android 1.0.58+), but it is probably also a bug in the source data? Moreover, it looks like the spaces were inserted to improve (?) text placement in the web application (??). Finally, the text is even split among several roads. Note that in that case you will not be able to find the street in search.
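
    A minimal sketch of how such names could be flagged during processing. The function names are hypothetical and this is not MapFactor's actual code; it only assumes the name is available as a plain string.

```python
import re


def name_whitespace_issues(name):
    """Return a list of whitespace problems found in a street name."""
    issues = []
    if name != name.lstrip():
        issues.append("leading whitespace")
    if name != name.rstrip():
        issues.append("trailing whitespace")
    # Runs of two or more whitespace characters inside the name itself.
    if re.search(r"\s{2,}", name.strip()):
        issues.append("repeated whitespace inside name")
    return issues


def normalize_name(name):
    """Collapse whitespace runs to single spaces and strip both ends."""
    return " ".join(name.split())
```

    A converter could report the issues and still use the normalized name for text placement and search.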

  • We never complain, we only give constructive hints ;-))))

    Good topic, anyway! :-) It would be even more helpful to provide OSM links of the issues you have found so that we could fix them right away.

  • ^^^yes, can you visualize the problems on a map along with an "edit in JOSM" link? I am pretty sure they will be fixed soon.
  • :) OK, that's why I put "complain" in quotation marks ...
    A randomly picked example (with duplicates filtered out):
    2013-12-09 12:59:45,647 - INFO - STARTED['./fix_roundabout.py', 'localhost', 'osm_usa_florida_2013_12_mff']
    2013-12-09 12:59:46,097 - WARNING - cleaning flag for 10961849
    2013-12-09 12:59:46,105 - WARNING - cleaning flag for 160439677
    2013-12-09 12:59:46,105 - WARNING - cleaning flag for 228767756
    2013-12-09 12:59:46,113 - WARNING - cleaning flag for 10964183
    2013-12-09 12:59:46,114 - WARNING - cleaning flag for 163601784
    2013-12-09 12:59:48,651 - INFO - FINISHED['./fix_roundabout.py', 'localhost', 'osm_usa_florida_2013_12_mff']
    .. these are original way IDs
  • OT somebody asked about Android version 1.0.58 - the link:
       http://download.mapfactor.com/mapfactor_navigator_1_0_x.apk
    was updated to 1.0.59 ...
  • Unfortunately there will always be some errors in the OSM data; that is simply the nature of crowd-sourced data, so all applications will need to be robust to various strange taggings and mistakes. Everything (tagging) that can go wrong probably is wrong somewhere in the huge OSM dataset. But as long as the overall proportion is sufficiently low, hopefully that can be tolerated.

    That said, it is always good to get feedback from users / developers to see which issues particularly affect the practical use of OSM and where efforts on improving the quality of the dataset are best spent.

    The key to improving a specific aspect of the dataset is nearly always to come up with good visualisations of where the issues are and to create easy links that allow people to fix them.

    So it would be great if, as part of your conversion process, you could generate a list of all affected ways with editor links and publish this information to the OSM community.

    Even better would be if you could help integrate those issue reports into the existing QA tools like "OSM inspector" or keepright. A good example for that might be the "routing" layer in OSM inspector, which, if I am not mistaken, was originally developed on behalf of Skobbler to help fix some of the main issues they identified for their use.
  • I have fixed the ways you mentioned for now, but that presumably is only a tiny fraction of the ways affected.

    Interestingly, apart from the last way, none of the others were even close to being a roundabout. So I do wonder a little how the junction=roundabout tag got onto those ways.

    The last way is an example of a "typical" error that I would expect is not that rare. A roundabout is correctly tagged as junction = roundabout, but for some reason it isn't closed around the full circle. E.g. because the way was split, or it was "built" from multiple ways and not all parts have the junction = roundabout tag attached.

    With regard to spurious spaces, I think there are some editors that automatically filter out extra spaces before they save the data, to reduce the issue. Also, I think there was a bot at some point that fixed tags with some obvious common mistakes like spurious spaces. Not sure if those would help with your issues. But the use of bots is often viewed very critically, as they don't have the necessary intelligence to distinguish between mistakes and deliberate non-standard use. So for everything but trivial mistakes they are generally not acceptable.
  • Hi apmon,
    thanks for your comments. Would just a list of ways be enough, or would some simple specific format be more suitable? It can probably be generated every month after/during data processing.
    thanks
       Martin
  • p.s. in the ideal case I would like to write/integrate a small (Python) script which would automatically report errors during data processing. It has to handle duplicates and false errors: say you have a roundabout of roundabouts which in reality is not a single circle, yet it is correct (from the point of view of the OpenStreetMap rules); you do not want to see this report every month. Also, hopefully closed roundabouts and spaces at the beginning/end of names would be just the first step, where there is a high probability that it is a mistake. Checks of turn restrictions, lanes, etc. can be integrated later.

    Is there some API we can already use?
  • 2013-12-13 00:07:06,691 - INFO - STARTED['./fix_roundabout.py', 'localhost', 'osm_germany_2013_12_north_mff']
    2013-12-13 00:07:07,429 - WARNING - cleaning flag for 246480890
    2013-12-13 00:07:07,492 - WARNING - cleaning flag for 225658989
    2013-12-13 00:07:07,518 - WARNING - cleaning flag for 225658991
    2013-12-13 00:07:07,537 - WARNING - cleaning flag for 99705550
    2013-12-13 00:07:08,238 - WARNING - cleaning flag for 227580983
    2013-12-13 00:07:08,239 - WARNING - cleaning flag for 227578418
    2013-12-13 00:07:09,231 - WARNING - cleaning flag for 28534690
    2013-12-13 00:07:10,037 - WARNING - cleaning flag for 180099180
    2013-12-13 00:07:10,755 - INFO - FINISHED['./fix_roundabout.py', 'localhost', 'osm_germany_2013_12_north_mff']

  • I have no idea how to search for locations with that data ...
  • 2013-12-13 00:07:07,429 - WARNING - cleaning flag for 246480890
    ... the number is the way ID, so you can type in the browser:
    http://www.openstreetmap.org/way/246480890
    or in JOSM download object "way" with "ID" set to 246480890 ...
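
    A minimal sketch of turning such a log into clickable links, assuming the "cleaning flag for <way ID>" line format shown in the logs above. The function name is hypothetical and this is not the script used in the actual processing.

```python
import re

# Matches the WARNING lines from the logs above and captures the way ID.
LOG_LINE = re.compile(r"WARNING - cleaning flag for (\d+)\s*$")


def way_urls_from_log(lines):
    """Return one OpenStreetMap URL per distinct way ID, in log order."""
    seen = set()
    urls = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # INFO lines and anything else are skipped
        way_id = int(match.group(1))
        if way_id in seen:
            continue  # filter duplicates
        seen.add(way_id)
        urls.append("http://www.openstreetmap.org/way/%d" % way_id)
    return urls
```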


  • Whereas the first batch of issues reported in Florida were clear errors in the data and easy to fix, the second batch of issues in Germany doesn't seem to be as clear cut. In fact I am not sure if they are wrong at all, or if this is something the converter needs to be able to deal with.

    It seems like a bunch of those are where round-abouts are split into multiple ways, but in combination, they do form a closed circle. Due to the way relations work (i.e. only on full ways), it is not always possible to guarantee that a round about consists of a single closed way. E.g. if there is a bus route going via the round-about, then you need to split the round-about way into multiple segments, only some of which will be part of the bus relation. Equally, if you were to add turn restrictions, you would need to split the round-about into multiple ways. So the converter will need to be able to reassemble all those fragments.

    Way 246480890 is also an interesting case. By the looks of it, this is where a T-junction is being turned into a round-about that is still under construction. So part of the way is being newly built and is still flagged as highway=construction. This section already has the junction=roundabout tag. The other half still seems to be the old way of the T-junction and is thus not yet marked as junction=roundabout, as it presumably isn't complete yet (only someone local will be able to tell when the construction is finished). So at this stage the navigator should likely just ignore the roundabout, as the tag is only on a highway=construction way, which should presumably be ignored altogether for routing as it isn't finished yet.

  • OK - at the moment (processing 2013.12 OSM data) the roundabout flag is automatically cleared on the listed ways. Also, if a roundabout is split into several ways, that is not a problem - we are looking for nodes which belong to only one way with the roundabout attribute set (3 or more is also bad, but it is harder to resolve automatically).
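
    One possible reading of this check, as a sketch only: count how many roundabout way ends meet at each node. The data layout (a dict of way ID to node ID list, already filtered to junction=roundabout) and the function name are assumptions, not MapFactor's actual implementation.

```python
from collections import Counter


def roundabout_end_problems(ways):
    """Find suspicious end nodes among ways tagged junction=roundabout.

    ways: dict mapping way ID -> ordered list of node IDs.
    Returns (dangling, branching): end nodes touched by exactly one way end,
    and end nodes touched by three or more way ends.
    """
    end_count = Counter()
    for nodes in ways.values():
        if len(nodes) < 2:
            continue  # degenerate way, nothing to connect
        # A single closed way has first == last, so its node is counted twice.
        end_count[nodes[0]] += 1
        end_count[nodes[-1]] += 1
    dangling = sorted(n for n, c in end_count.items() if c == 1)
    branching = sorted(n for n, c in end_count.items() if c >= 3)
    return dangling, branching
```

    A roundabout split into several ways passes as long as every end node is shared by exactly two way ends. This simplification looks only at end nodes, so a way that ends on the middle of another roundabout way would still be reported as dangling.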
  • OK, sounds good. Perhaps one way to proceed would be that (once done) you publish the full list of ways that were flagged as problematic by your converter. The format probably doesn't matter for now, as long as it is reasonably easy for humans to read and for tools to parse. Then people can have a look to judge the quality of these error reports. If they turn out to be of high quality, i.e. have a low number of false positives, then we can work on getting them to a bigger audience to help make sure they actually get fixed.
  • Hi all,
    there is now a simple script for extracting roundabout issues from the processing logs. It is public at
       https://github.com/mapfactor/osm-tools/blob/master/log2html.py
    just to get started, somehow. I hope that as soon as we conclude that this info is useful, it will be integrated directly into the data conversion process (i.e. it won't be only log parsing). An example of output for one processing machine is available here:
       http://download.mapfactor.com/osm_data_report_2013_12.html
    regards
       Martin

  • The OSM wiki in German for roundabouts says http://wiki.openstreetmap.org/wiki/DE:Tag:junction%3Droundabout that a roundabout should be one closed circle. A bus route should cover all the roundabout, even if the bus only goes part of the circle.

    The English wiki says to draw a closed circle as well http://wiki.openstreetmap.org/wiki/Tag:junction%3Droundabout
  • As I wrote earlier - for us it does not matter if a roundabout is split into several ways, as long as together they create one closed circle.
  • Thanks for the data report. I have gone through a whole bunch of them now and fixed some of them.

    In a good fraction of the cases, junction=roundabout was simply wrong. E.g. it was a long straight way that under no definition could be considered a roundabout, so I simply deleted those tags there.

    In a couple of cases (e.g. http://www.openstreetmap.org/way/6055699 ) I think there might be issues with the conversion as half of the round-about is in one state (e.g. Washington D.C.) and the other half in another ( e.g. Maryland), but the data itself looks fine.

    There were also a couple of cases which mostly look like a round-about, but looking at the satellite imagery, it appears as if the roundabout is indeed not fully closed (e.g. http://www.openstreetmap.org/way/177926557 ).


    Although there were very few cases where there was a real roundabout with a fixable data error (i.e. where simply deleting the junction=roundabout tag isn't the correct solution), those lists are probably still pretty useful.
  • OK thanks - I can confirm that roundabouts on the border of two countries/states/regions (for the split countries, Germany and France at the moment) will be reported wrongly. I can add an extra condition for that.
  • Region/Country administrative polygons
    In order to properly split the whole planet into states/regions/countries we need geometrically valid multipolygons. We used backup borders, which were applied in case of an "invalid boundary", but ... then you can have two neighboring countries with slightly shifted borders, and that was/is causing problems with routing (for example, a motorway between the Czech Republic and Germany was cut into two separate pieces).

    [I am adding this note as france_osm_east would not be converted for a second month, and connectivity with switzerland_osm can be broken (changeset)]
  • Here is a list of boundary problems for planet 2013/12:
    nohup_2013_12/bangladesh_osm1.out:Original boundary not valid!!!
    nohup_2013_12/botswana_osm1.out:Original boundary not valid!!!
    nohup_2013_12/cameroon_osm1.out:Original boundary not valid!!!
    nohup_2013_12/chad_osm1.out:Original boundary not valid!!!
    nohup_2013_12/crozet_islands_osm1.out:Original boundary not valid!!!
    nohup_2013_12/cyprus_osm1.out:Original boundary not valid!!!
    nohup_2013_12/el_salvador_osm1.out:Original boundary not valid!!!
    nohup_2013_12/equatorial_guinea_osm1.out:Original boundary not valid!!!
    nohup_2013_12/eritrea_osm1.out:Original boundary not valid!!!
    nohup_2013_12/ethiopia_osm1.out:Original boundary not valid!!!
    nohup_2013_12/germany_osm_north1.out:[None, None, <shapely.geometry.multipolygon.MultiPolygon object at 0x1272d50>, None, None]Original boundary not valid!!!
    nohup_2013_12/guatemala_osm1.out:Original boundary not valid!!!
    nohup_2013_12/mauritania_osm1.out:Original boundary not valid!!!
    nohup_2013_12/morocco_osm1.out:Original boundary not valid!!!
    nohup_2013_12/namibia_osm1.out:Original boundary not valid!!!
    nohup_2013_12/new_caledonia_osm1.out:Original boundary not valid!!!
    nohup_2013_12/senegal_osm1.out:Original boundary not valid!!!
    nohup_2013_12/south_africa_osm1.out:Original boundary not valid!!!
    nohup_2013_12/usa_pennsylvania_osm1.out:Original boundary not valid!!!
    nohup_2013_12/venezuela_osm1.out:Original boundary not valid!!!
    nohup_2013_12/western_sahara_osm1.out:Original boundary not valid!!!
    nohup_2013_12/zambia_osm1.out:Original boundary not valid!!!

    nohup_1312/belgium_osm1.out:Original boundary not valid!!!
    nohup_1312/france_osm_east1.out:[None, <shapely.geometry.multipolygon.MultiPolygon object at 0x10dbfd0>, <shapely.geometry.polygon.Polygon object at 0xe0a4190>, <shapely.geometry.polygon.Polygon object at 0xe0a41d0>, <shapely.geometry.polygon.Polygon object at 0xe0a4290>, <shapely.geometry.polygon.Polygon object at 0xe0a42d0>]Original boundary not valid!!!
    nohup_1312/france_osm_north1.out:[None, <shapely.geometry.multipolygon.MultiPolygon object at 0x10dbfd0>, <shapely.geometry.polygon.Polygon object at 0x7b40110>, <shapely.geometry.multipolygon.MultiPolygon object at 0x7b401d0>, <shapely.geometry.polygon.Polygon object at 0x7b40250>]Original boundary not valid!!!
    nohup_1312/netherlands_osm1.out:Original boundary not valid!!!
    nohup_1312/poland_osm1.out:Original boundary not valid!!!
    nohup_1312/switzerland_osm1.out:Original boundary not valid!!!
    nohup_1312/usa_maryland_osm1.out:Original boundary not valid!!!

    I am not sure how to export these data yet. We would like to use an "online backup", i.e. a simple way to recompute a region with a boundary patch downloaded from the OSM server. Sometimes the fix is simple, but not always ...

  • Unfortunately given the complexity of the boundary relations and the number of ways involved they are rather fragile and can break relatively easily.

    Could you add the relation numbers for the boundaries that aren't working? Also do you have a more specific reason why they aren't correctly converted?

    I have had a look at a couple of the relations named (Senegal, Poland, Maryland) and they all seemed to have valid geometries. At least osm2pgsql did convert them correctly into polygons. I haven't checked whether the relations have been fixed since the time you reported them, so it might simply be that they are fine again now. But is it also possible that the MapFactor converter is less robust to some of the mapping inconsistencies that other tools can still work with?

    Altogether boundary data is important for many applications, so ensuring that they are of good quality should be a priority.

    There are a couple of tools that try and help ensure that the boundary relations aren't broken. For example the geofabrik inspector, but that tool itself seems to be broken at the moment.


  • Hi apmon,
    thank you for your comment. I will modify the output to also report the relation number and a description of the failure. I may start with the three states you mentioned (Senegal, Poland, Maryland).
    thanks
       Martin
  • boundary of Senegal was OK (2014/01)
  • boundary of Maryland was OK (2014/01)
  • Poland boundary (2014/01) - partial result:
    Not closed polygon:((51163536, 193920734, 324080658L), (51164281, 193920437, 1924534790L))
    ... these are (x,y,nodeID), i.e. that "boundary ends" at http://www.openstreetmap.org/node/324080658
    ... I will investigate this more closely.
    thanks
       Martin
    p.s. relation ID = 49715
    Edit: somebody was already editing the boundary, so I did not find nodeID=1924534790 and the first node is in way:
      <way id='28693048' timestamp='2014-01-17T13:11:35Z' uid='145231' user='woodpeck_repair' visible='true' version='15' changeset='20050162'>
        <nd ref='324080658' />
        <nd ref='290833256' />
        <nd ref='290833257' />
        <nd ref='1740733406' />
        <nd ref='1740733405' />
        <tag k='admin_level' v='2' />
        <tag k='border_type' v='nation' />
        <tag k='boundary' v='administrative' />
        <tag k='maritime' v='yes' />
        <tag k='source:boundary' v='WDB' />
      </way>

  • Here is an example of problematic country border (version #26) - New Caledonia:
    http://www.openstreetmap.org/relation/2177258
    as a result it "removes" all the map data.
  • Multipolygon Geometry
    - here is an example of polygon causing "geometric" problems:
    http://www.openstreetmap.org/relation/2656558
    i.e. self-touching boundary
    If I understand it correctly, this is OK from the OSM point of view, so we should do something about it in the processing/conversion phase ... right?
  • >- here is an example of polygon causing "geometric" problems:
    http://www.openstreetmap.org/relation/2656558<

    For me it's not correct. The two houses "18" should be inner of that MP. I'll correct it.

    Edit: done
    (and corrected some other mistakes)
  • Thanks - unfortunately I discovered that verbose description of invalid polygons sometimes kills otherwise reasonably stable computations:
       http://www.openstreetmap.org/relation/1236044
    ... maybe time for tools upgrade.
  • >Thanks - unfortunately I discovered that verbose description of invalid polygons sometimes kills otherwise reasonably stable computations:
       http://www.openstreetmap.org/relation/1236044
    ... maybe time for tools upgrade.<
    Also a bad construction that needs correcting. (Edit: done)

    And by the way, to prevent misunderstanding:
    Writing "The two houses '18' should be inner of that MP." above I meant "landuse in landuse" and not "building in landuse".
  • I have not updated this thread for a while ... there were two "missing lake/island" reports this morning (Japan and Netherlands) ... one is fixed now, one is not ... if there is anybody willing to help correct multipolygon errors (or potentially MapFactor interpretation errors, in which case we would try to improve the tools) ... I could dump the conversion logs here ... (just let me know which country/region).

    Japan example:
    https://www.openstreetmap.org/relation/2344533

    during processing we got something like:

    getWaterShp: processing square 'gaaabbdcdbc'
    ROOT-INTERSECTIONBad water relation/polygon 959541 -
    Self-intersection[495388392 124864855]Bad water relation/polygon 2344533 - 浜名
    Self-intersection[495879187 125225758]Bad water relation/polygon 3502876 -
    Not closed polygon:((496097065, 125158701, 2667070866L), (496093309, 125158678, 2667070864L))Bad water relation/polygon 3505443 -

    ... the reason is that there is inner ring touching outer ring.

    From http://wiki.openstreetmap.org/wiki/Relation:multipolygon
    "Avoid building multipolygons where an inner ring touches an outer ring though."
    I would guess that it is a mistake in the source, right?
    thanks
       Martin
    p.s. reported coordinates are in milliseconds of arc, i.e. divide by 3600000.0 to get degrees
    (edit) p.s.2 "water" is sometimes misleading ... they are multipolygons in general
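
    A sketch of decoding such lines into degrees and a relation link, based only on the log samples in this post. The regular expression and the function name are hypothetical, and treating the first value as longitude and the second as latitude is an assumption (it is consistent with the Japan example, which lands at roughly 137.6, 34.7).

```python
import re

MS_PER_DEGREE = 3600000.0  # divisor given in the post above

# Matches lines such as
# "Self-intersection[495388392 124864855]Bad water relation/polygon 2344533 - ..."
POINT_ERROR = re.compile(
    r"(?P<kind>[A-Za-z][A-Za-z \-]*)"
    r"\[(?P<x>-?[\d.]+) (?P<y>-?[\d.]+)\]"
    r"Bad water relation/polygon (?P<rel>\d+)"
)


def parse_point_error(line):
    """Return a dict describing a single-point error line, or None."""
    match = POINT_ERROR.search(line)
    if match is None:
        return None  # e.g. ROOT-INTERSECTION lines carry no coordinates
    return {
        "kind": match.group("kind").strip(),
        "lon": float(match.group("x")) / MS_PER_DEGREE,
        "lat": float(match.group("y")) / MS_PER_DEGREE,
        "relation_url": "https://www.openstreetmap.org/relation/" + match.group("rel"),
    }
```

    The "Not closed polygon" lines use a different layout with (x, y, nodeID) tuples and would need their own pattern.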
  • The Netherlands section: is that one solved or not?
    If not can you please share the specifics? I can take a look at it
  • the output looks something like this (it was much bigger, so I only copied the beginning to give you an idea)
    thank you
       Martin

    Self-intersection[20575150.5746991 182940045.203698]Bad water relation/polygon 1206691 -
    Self-intersection[20432545.0288103 183066876.462116]Bad water relation/polygon 1206730 -
    Not closed polygon:((20432543, 183066879, 937696405L), (20432814, 183066540, 2420788559L))Bad water relation/polygon 1206716 - Hoge Fronten
    Self-intersection[20764885.9004429 182981855.030517]Bad water relation/polygon 3481854 -
    Self-intersection[20781268 183570134]Bad water relation/polygon 3380488 -
    Interior is disconnected[20972994 183420516]Bad water relation/polygon 3705866 -
    ROOT-INTERSECTIONBad water relation/polygon 3202846 - Dentgenbach
    Not closed polygon:((21830543, 183222425, 940648515L), (21830285, 183222659, 2703869250L))Bad water relation/polygon 3557608 -
    Self-intersection[21447376.9541498 183537449.921064]Bad water relation/polygon 1592136 -
    ROOT-INTERSECTIONBad water relation/polygon 1592143 -
    Self-intersection[21060720 184044312]Bad water relation/polygon 1586294 -
    Not closed polygon:((21631220, 184218067, 1284555295L), (21631185, 184218167, 1284553369L))Bad water relation/polygon 3833746 -
    Not closed polygon:((21633307, 184214765, 1284553032L), (21630730, 184213747, 1284566765L))Bad water relation/polygon 3833747 -
    Too few points in geometry component[12552192 185028553]Bad water relation/polygon 1146489 -
    ROOT-INTERSECTIONBad water relation/polygon 1148643 -
    ROOT-INTERSECTIONBad water relation/polygon 3832212 -
    ROOT-INTERSECTIONBad water relation/polygon 2940112 -
    Self-intersection[14605706 184838930]Bad water relation/polygon 3430538 -
    Too few points in geometry component[14206858 186394920]Bad water relation/polygon 940363 - Grevelingenmeer
    Self-intersection[14128420.2453638 186228628.204144]Bad water relation/polygon 1919642 -
    Self-intersection[15703888.804262 185121356.667129]Bad water relation/polygon 1159305 -
    Not closed polygon:((14885865, 185134429, 3005598879L), (14880158, 185136741, 892348721L))Bad water relation/polygon 3947486 -
    Self-intersection[16302202 185571802]Bad water relation/polygon 3651351 -
    Self-intersection[15310570.0194282 185871125.137176]Bad water relation/polygon 1408308 - Volkerak
    Not closed polygon:((14869519, 186572379, 2749684042L), (14869542, 186572463, 2737737411L))Bad water relation/polygon 3625270 -
    Self-intersection[15403849 186569383]Bad water relation/polygon 3841320 -
    Self-intersection[16284754 185727515]Bad water relation/polygon 3646887 - Huisartsenpraktijk
    Self-intersection[16415240.9172472 186541213.391001]Bad water relation/polygon 1793472 -

  • I can easily spot some bad (multi)polygons. I will have a look at it.
    (does this also mean that the Dutch map will be delayed?)
  • No - similar reports are on all maps every month. netherlands_osm is in the processing queue now.
  • This month we encountered an interesting problem - see (canada_british_columbia_osm):
    https://www.openstreetmap.org/node/3136959394
    The number of links for a given junction is limited in the current version of MCA, and I am not sure if this is the correct way to map it ... any comment?

  • Very interesting, indeed. They have mapped each lane as a separate way. https://www.google.de/maps/place/Vancouver,+Britisch-Kolumbien,+Kanada/@49.0094915,-123.1288831,375m/data=!3m1!1e3!4m2!3m1!1s0x548673f143a94fb3:0xbb9196ea9b81f38b I would have preferred just two ways with the relevant number of lanes. But I am not so deeply involved in mapping that I would know whether OSM standards provide a maximum number of lanes.
  • The mapping of multi-lane roads as such is also a long running discussion within OSM. For motorways it is most common to really make it as two separate roads. For trunks it is not always clear and for lower priority multi-lane roads it is even less clear, especially when it comes to junctions (your post of Zwickauer Strasse is such an example).
    Some say that the lanes should not be separated but it should be one road while using the lanes tag (like lanes=4 with lanes:forward=2), and also with turn:lanes (possibly with forward & backward) 

  • Tsawwassen Terminal has been changed by me in changeset 28424452 as detailed in OSM Note 284905 referencing this thread.

  • OK, thank you - for us the main problem was only the high degree of one particular node (3136959394), so you probably removed too much (?).
  • He did not just remove. There are three sections with a couple of lanes each. Between those sections are buildings, greenery and so on. So he made one way for each section, and each way has the number of lanes that exist in reality. I think it's ok now.
  • My colleague generated a report from the February 2015 OSM conversion with a list of errors for each country. Again, there can be "false reports", and it is one file for the whole world (5MB). At the moment it is tweaked to parse our processing log files, but if you find some bits which could simplify debugging, let us know.
    http://download.mapfactor.com/osm_data_report_2015_02.xml
    thanks
       Martin

  • Hi Martin,

    if we determine that these were indeed errors, will they be visible in the Feb 2015 OSM conversion after they are fixed in OSM? I assume that's why we have the "Early access maps": to fix issues like this before the final release of the map? ...and if so, where should we report errors in maps that have already been fixed since the release of the early access maps?
  • cutoff is the end of each calendar month - changes made in February will be in the March release
    Early maps are approx. one week earlier than the full release
  • so what's the point of early maps then? if not fixing some known issues before the final maps are released?
  • not sure what you mean by 'not fixing'
    point is to test before everybody gains access, so I think that we agree
  • so if I fixed some issues (routing problems in OSM) after the early access map releases...will they appear in the final release to the general population in a couple of days?
