Technical/Design/Implementation Discussions (CHM/Travel Mapping)

Started by Jim, April 04, 2015, 09:50:22 PM

SSOWorld

As mentioned, I have an older distro.  It installed Python 3.2.3 when I called for it.  It's still older than yours, though. The "flush" function is in a newer minor version (3.3?); I just commented it out and the code "worked".  But I agree we should not backport it unless we absolutely have to.
Scott O.

Not all who wander are lost...
Ah, the open skies, wind at my back, warm sun on my... wait, where the hell am I?!
As a matter of fact, I do own the road.
Raise your what?

Wisconsin - out-multiplexing your state since 1918.


Jim

Quote from: SSOWorld on June 24, 2015, 09:07:12 PM
As mentioned, I have an older distro.  It installed Python 3.2.3 when I called for it.  It's still older than yours, though. The "flush" function is in a newer minor version (3.3?); I just commented it out and the code "worked".  But I agree we should not backport it unless we absolutely have to.

I hope there's not much that would matter in that case.  You're correct that things will work just fine without the flush calls - that's just to be able to see output on the screen before the newline is printed.
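
For anyone stuck on a pre-3.3 interpreter, one way to keep that behavior without the `flush` keyword of `print` is sketched below. The helper name `show_progress` is made up for illustration; it relies only on the `write` and `flush` methods of `sys.stdout`, which are available in Python 3.2 as well.

```python
import sys

def show_progress(text, stream=None):
    """Write text with no trailing newline and flush it immediately."""
    if stream is None:
        stream = sys.stdout
    stream.write(text)
    stream.flush()  # without this, buffered text may not appear until a newline

show_progress("Reading files...")
show_progress(" done\n")
```

Dropping the `flush()` line changes only when the text becomes visible, not what is eventually printed, which matches the observation that the code still "worked" with the calls commented out.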
Photos I post are my own unless otherwise noted.
Signs: https://www.teresco.org/pics/signs/
Travel Mapping: https://travelmapping.net/user/?u=terescoj
Counties: http://www.mob-rule.com/user/terescoj
Twitter @JimTeresco (roads, travel, skiing, weather, sports)

Jim

This could go here or in any of a few other threads.  It's of most interest to those who'll be maintaining highway data.

My first pass at data checks on the highway data is in place.  CHM had a list of 18 error checks (plus one "undisclosed"), and those are shown at http://cmap.m-plex.com/tools/datacheck.php.  I believe I have code that attempts to check for all except #11, which I haven't quite figured out how to do easily.

The current detected errors are in http://www.teresco.org/~terescoj/travelmapping/logs/datacheck.log.  I am making no attempt to handle false positives yet, and my distances don't have that factor Tim used to improve route length approximations.  It's essential we find a way to account for the thousands of FPs we reported in CHM before we worry about fixing these.  Other than that, things should probably match the errors shown on CHM's list.

Please let me know if you notice anything that seems wrong about the reported errors.

rickmastfan67

Quote from: Jim on June 25, 2015, 09:10:38 PM
Please let me know if you notice anything that seems wrong about the reported errors.

Spotted this weird one where the ')' is in the wrong place.

CA CA47 I-710(0 mi)ght not refer to an exit 0

Jim

Quote from: rickmastfan67 on June 25, 2015, 10:19:59 PM
Quote from: Jim on June 25, 2015, 09:10:38 PM
Please let me know if you notice anything that seems wrong about the reported errors.

Spotted this weird one where the ')' is in the wrong place.

CA CA47 I-710(0 mi)ght not refer to an exit 0

Thanks - that was a copy-and-paste mistake. This and others like it should be gone the next time I update.

yakra

Quote:
my distances don't have that factor Tim used to improve route length approximations.
The exact value of the fudge factor is 1.02112.
"Officer, I'm always careful to drive the speed limit no matter where I am and that's what I was doin'." Said "No, you weren't," she said, "Yes, I was." He said, "Madam, I just clocked you at 22 MPH," and she said "That's the speed limit," he said "No ma'am, that's the route numbah!"  - Gary Crocker

sammi

What exactly does this 'fudge factor' do? What does it stand for?

Rothman

Quote from: sammi on June 25, 2015, 11:35:34 PM
What exactly does this 'fudge factor' do? What does it stand for?

It stands for freedom.

(Sorry, couldn't resist...*ducks out*)
Please note: All comments here represent my own personal opinion and do not reflect the official position(s) of NYSDOT.

yakra

Quote from: sammi on June 25, 2015, 11:35:34 PM
What exactly does this 'fudge factor' do? What does it stand for?
As Jim said, it's used to improve route length approximations.
It dates back to the very early days of Tim's site, back when it was just "Clinched Interstate Mapping". There were no hidden shaping points back then, and due to the route trace directly connecting adjacent interchanges, even when distantly spaced and separated by a meandering roadway, the calculated mileage came up about 2% short on average from the actual route lengths. So, Tim calculated the fudge factor to multiply the routes' length, and more accurately match the total mileage of the interstate system.
A few years back, there was a little (just a little) discussion about recalibrating the fudge factor due to having more accurate route traces with the inclusion of hidden shaping points. But it wasn't viewed as terribly important, and ultimately never happened.
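
A minimal sketch of how such a factor could be applied, assuming route length is computed as the sum of great-circle hops between consecutive waypoints. The haversine formula, the Earth radius value, and the function names are assumptions for illustration, not Tim's actual code; only the 1.02112 value comes from this thread.

```python
import math

EARTH_RADIUS_MI = 3963.1    # assumed radius in miles; not taken from the CHM code
CHM_FUDGE_FACTOR = 1.02112  # the value quoted in this thread

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))

def route_miles(points, factor=CHM_FUDGE_FACTOR):
    """Sum straight-line hops between consecutive waypoints, then scale."""
    raw = sum(
        great_circle_miles(a[0], a[1], b[0], b[1])
        for a, b in zip(points, points[1:])
    )
    return raw * factor
```

Passing `factor=1.0` gives the unscaled trace length, which makes it easy to compare the two figures for any one route.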

Jim

Quote from: yakra on June 26, 2015, 11:43:11 AM
Quote from: sammi on June 25, 2015, 11:35:34 PM
What exactly does this 'fudge factor' do? What does it stand for?
As Jim said, it's used to improve route length approximations.
It dates back to the very early days of Tim's site, back when it was just "Clinched Interstate Mapping". There were no hidden shaping points back then, and due to the route trace directly connecting adjacent interchanges, even when distantly spaced and separated by a meandering roadway, the calculated mileage came up about 2% short on average from the actual route lengths. So, Tim calculated the fudge factor to multiply the routes' length, and more accurately match the total mileage of the interstate system.
A few years back, there was a little (just a little) discussion about recalibrating the fudge factor due to having more accurate route traces with the inclusion of hidden shaping points. But it wasn't viewed as terribly important, and ultimately never happened.

For now, I think I should use the CHM factor so we can see unintentional discrepancies between old and new stats.  If I add that factor into a couple places I think we should see matching numbers.

I think it remains a long-term and low-priority goal to come up with something better, like the option of per-route or even per-segment overrides to account more accurately for the lengths of especially straight or especially curvy routes.

Jim

Well, what do you know.. the factor was sitting right there in the JS distance functions I copied from Tim years ago.  So route lengths in the draft highway browser should match CHM, and should later match CHM in the mapping overlay viewer once I code up a way to avoid counting segments along concurrencies multiple times.
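
One possible way to avoid that double counting, sketched under assumptions: treat a segment as the unordered pair of its endpoint coordinates, and add its length only the first time that pair is seen. The tuple layout and the name `unique_mileage` are hypothetical, not the site update code, and the approach only works when concurrent routes use exactly matching coordinates.

```python
def unique_mileage(segments):
    """Total length, counting each physical segment once.

    segments: iterable of (endpoint_a, endpoint_b, miles), where each
    endpoint is a (lat, lon) tuple. Direction of travel is ignored.
    """
    seen = set()
    total = 0.0
    for a, b, miles in segments:
        key = frozenset((a, b))
        if key in seen:
            continue  # already counted via a concurrent route
        seen.add(key)
        total += miles
    return total

# Made-up coordinates and lengths: two routes share the p-q segment.
p, q, r = (44.0, -70.0), (44.1, -70.0), (44.2, -70.0)
route_a = [(p, q, 7.0), (q, r, 7.0)]
route_b = [(q, p, 7.0)]  # same road as p-q, listed in the other direction
print(unique_mileage(route_a + route_b))
```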

yakra

Quote from: Jim on June 26, 2015, 04:02:58 PM
I think it remains a long-term and low-priority goal to come up with something better, like the option of per-route or even per-segment overrides to account more accurately for the lengths of especially straight or especially curvy routes.
If a given segment carries US202, ME4, and ME100, and each route has a different respective scale factor... what then? What is the scale factor of this segment? What is this segment's length?

That, and just the idea of potentially putting that much more work on each collaborator's plate when drafting a system, makes me skeptical. I just see headaches.

mapcat

Quote from: yakra on June 27, 2015, 02:26:09 AM
If a given segment carries US202, ME4, and ME100, and each route has a different respective scale factor... what then? What is the scale factor of this segment? What is this segment's length?

That, and just the idea of potentially putting that much more work on each collaborator's plate when drafting a system, makes me skeptical. I just see headaches.

Great point.  What's wrong with scrapping the scale factor entirely, then, and instead encouraging more accurate representation of a route's true shape via more hidden shaping points?  The most compelling arguments I've read so far were, essentially, that it would be more work to make the initial file, and that processing the larger files would take longer when drawing the maps.

yakra

Quote from: mapcat on June 28, 2015, 12:48:10 PM
Great point.  What's wrong with scrapping the scale factor entirely, then, and instead encouraging more accurate representation of a route's true shape via more hidden shaping points?  The most compelling arguments I've read so far were, essentially, that it would be more work to make the initial file, and that processing the larger files would take longer when drawing the maps.
Not just drawing the maps, but, quoting Tim from back in 2010:
Quote:
- Some web scripts in the site necessarily have execution times roughly proportional to the number of points to be searched or plotted. Raising the shaping point count by a factor of 5 while raising the number of supported routes will eventually have a crippling effect.
I'm thinking multiplex detection and its O(n log n) complexity. There may be other similar routines too. (Show Intersecting Highways?)

In the same post, Tim went on to outline the diminishing returns of putting more effort into greater shaping point detail:
Quote:
- Using 5 times as many points as needed -- mostly points that will never ever be used by anyone - generates only a 1-2% improvement in the highway mileage, which is clearly not worth the effort.

- No one comes to the CHM site for hi-resolution maps, so 5 times the effort doesn't pay off there either.

- The use of a modest number of shaping points in addition to the required points has proven to be a more modest amount of work for a more significant improvement in maps and mileages than a subsequent doubling or more of the number of shaping waypoints. This is why shaping points should be based on an intermediate scale: 5mi/10km on Google.
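
The diminishing returns can be illustrated with an idealized case, assuming the road is a single circular arc and the points are spaced evenly along it (real routes are messier). The helper `chord_shortfall` is a made-up name; it returns the fraction by which the straight-line trace under-measures the arc, which works out to 1 - sin(h)/h, where h is half the angle each segment spans.

```python
import math

def chord_shortfall(turn_degrees, n_segments):
    """Fraction by which n equal chords under-measure a circular arc."""
    half_angle = math.radians(turn_degrees) / (2 * n_segments)
    return 1.0 - math.sin(half_angle) / half_angle

for n in (1, 2, 5, 10):
    print(n, "segments:", round(100 * chord_shortfall(90, n), 2), "% short")
```

For a 90-degree curve, a single segment comes up roughly 10% short, and each doubling of the number of segments cuts the remaining shortfall to about a quarter. The first few extra points therefore recover most of the missing length, and later ones recover very little.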

yakra

Tim, Sun Dec 06, 2009:
Quote:
Yes, some of the right-angle turns make big % errors if you skip them, so you should add shaping points there. I pointed out only 2 errors of this sort in the SD set recently, but it's a general point. We weren't this picky when collecting the US highway, but enough of you suggested we bump up the standard, so we try to get the highways a little more accurate now.

The goal I'd like to see is to get lengths of routes correct to within 2% of the correct length, but verifying that goal can be tedious and not worth the effort. So I came up with a more visual goal instead, which Eric mentioned, to approximate it: keep the centerline of the highway within the blue polyline at the 5mi/10km zoom level of Google Maps in the HB. This should make the grid and city-level php maps a little more accurate too where we want the shape correct as well as the total length.

Also note that there is an automatic +2% correction slapped onto all the highway segments and therefore also all the route totals, since in general we underestimate route lengths by our low-res highway data. This correction roughly fixes freeway mileages on average (some are high and some are low, instead of all being low), and I haven't calibrated it any further.

Tim, Wed Dec 09, 2009:
Quote:
Well, the 2% accuracy goal (that I wanted but threw out) and the 2% fudge factor are two different things. It's a coincidence that they are the same at the moment. (Actually the fudge factor is 2.112% in my code and should probably be a little lower now that more routes are better traced.)

I do promise not to shrink the width of the blue line in the HB.

If anyone would like to recalibrate the fudge factor, I have some good ideas for doing so, but no motivation to do it myself.

mapcat

Some great arguments for not being over-detailed universally, sure.  And 2% accuracy is outstanding; nobody would argue with that, or even 4 or 5%, I'm sure.  But on the minority of routes that have dozens of hairpin turns *and* are quite long, what's the harm of being a little more true to the actual shape of the road?  Those examples where the file shows a route being >10% off reality are the ones that seem to warrant re-examining the rules, IMO.

oscar

Quote from: mapcat on June 28, 2015, 05:32:39 PM
Some great arguments for not being over-detailed universally, sure.  And 2% accuracy is outstanding; nobody would argue with that, or even 4 or 5%, I'm sure.  But on the minority of routes that have dozens of hairpin turns *and* are quite long, what's the harm of being a little more true to the actual shape of the road?  Those examples where the file shows a route being >10% off reality are the ones that seem to warrant re-examining the rules, IMO.

Even some of the shorter routes with lots of hairpin curves would bloat out a lot to gain better distance accuracy.  HI 360, for example, would go from two dozen points to hundreds or even thousands. The distance accuracy would improve a lot percentage-wise, but IMO not worth the effort, even though it would add several miles to my Hawaii mileage (too bad I can't get extra credit for having traveled it more than a dozen times).

Tim's guidance, quoted above, allows latitude for adding a small number of "extra" shaping points to improve distance accuracy without too much more work for the route file drafter and/or the server. I took advantage of that for HI 378, which has several largish switchbacks, not all of which needed shaping points under the usual rules, but I included them anyway. Nick did the same for his draft UT 261 route file (which not only improved accuracy, but also highlights the famously hairy part of the route in the middle).
my Hot Springs and Highways pages, with links to my roads sites:
http://www.alaskaroads.com/home.html

Jim

I know I started this whole discussion with my comment about possibly adding per-route or per-segment overrides to the factor, but I want to emphasize that it would be something well down the road, when we have all the data and tools we could ever want and have everything in place, and we know the efficiency of all of the processes when they've been tested at scale.  I suspect we'll be able to handle more data than the old system did efficiently.  I've been trying to design everything to put as much of the computationally intensive things in the site update program (which I envision running maybe once or twice a day, so it could reasonably take a few hours - right now takes about 5 minutes) and into the DB queries (which so far are instantaneous for all intents and purposes) rather than into web-facing PHP or JS code that would run on demand when a page is accessed.  As more stats are added and more users are added, something's going to get a lot more expensive, but I hope it's rarely going to be anything that makes individual page accesses (with the stats and maps they generate) too expensive.  So I suggest we maintain the CHM guidelines for waypoint density for now, and think about the issue when we know better what the costs would be (in code development time, highway system development time, and compute time).

SSOWorld

Quote from: SSOWorld on June 24, 2015, 09:07:12 PM
As mentioned, I have an older distro.  It installed Python 3.2.3 when I called for it.  It's still older than yours, though. The "flush" function is in a newer minor version (3.3?); I just commented it out and the code "worked".  But I agree we should not backport it unless we absolutely have to.
Not an issue, found another ppa and got the right version installed.

SSOWorld

FYI - and if anyone's already started tell me so I stop - I'm starting on the web rendering of yakra's cpp to generate the CHM maps of systems.  Prototyping first, fine tune later.  yakra - please let me know if you have tinkered more with your code since the last check-in to github.  NOTE: instead of pulling data from csvs as yakra did - I'll be using the database.

Will be (obviously) writing it in PHP.

I'll keep you posted to my progress but since it's on my off time, don't expect results this week ;)  :awesomeface:

yakra

I haven't changed the code since then. It's not really something I'm actively working on anymore.

Jim

Quote from: SSOWorld on June 29, 2015, 10:02:28 AM
FYI - and if anyone's already started tell me so I stop - I'm starting on the web rendering of yakra's cpp to generate the CHM maps of systems.  Prototyping first, fine tune later.  yakra - please let me know if you have tinkered more with your code since the last check-in to github.  NOTE: instead of pulling data from csvs as yakra did - I'll be using the database.

Will be (obviously) writing it in PHP.

I'll keep you posted to my progress but since it's on my off time, don't expect results this week ;)  :awesomeface:

Excellent!  If you need anything different in the DB to support this, let me know.  Have you been able to populate your own copy of the DB to experiment with?

SSOWorld

Quote from: Jim on June 29, 2015, 02:29:54 PM
Quote from: SSOWorld on June 29, 2015, 10:02:28 AM
FYI - and if anyone's already started tell me so I stop - I'm starting on the web rendering of yakra's cpp to generate the CHM maps of systems.  Prototyping first, fine tune later.  yakra - please let me know if you have tinkered more with your code since the last check-in to github.  NOTE: instead of pulling data from csvs as yakra did - I'll be using the database.

Will be (obviously) writing it in PHP.

I'll keep you posted to my progress but since it's on my off time, don't expect results this week ;)  :awesomeface:

Excellent!  If you need anything different in the DB to support this, let me know.  Have you been able to populate your own copy of the DB to experiment with?

Jim - yes I have.  I'll keep you posted

yakra - thanks for the heads up.

yakra

Quote from: english si on June 13, 2015, 01:00:52 PM
Quote from: Jim on June 13, 2015, 12:06:56 PM
Mainly, what I want to suggest is that if there's no good reason to have to have points off by a small amount, we should have them in the exact same point.
The only reasons to make intersecting points exactly identical are:
1) anal retentiveness
2) making sure there's a concurrency
3) making it easier to make files by starting with points that intersect that route

The first is literally "no good reason", though the second is (but only applies to specific bits). The third is a time saver (negated by the hassle of keeping points identical - if you move one point to be more accurate because it was slightly off, then you have to do the same in every other route that meets there).

NMP is not a true error (unless a concurrency that should be isn't registered) that causes issues with end users and statistics and so on. But sure, it's a falling short from perfection and ought to be flagged as a potential error.

And certainly, while you have a script for it, I don't want files merged to the average location of the point. All the GB ones come from when, either through peer review or update, I've got a more pinpointed location, and simply not done it on other routes. Averaging out would take a 0m wrong point and a 10m wrong point and make 2x 5m wrong points. If I sort out those NMPs, it will be manually and with the priority of the snipe hunt that it is.
Whoops. I left a drafted response here on Desktop #1 for weeks now...

Si, I'm glad to see you regard NMPs as "a falling short from perfection" that "ought to be flagged as a potential error".
I always regarded it as best practice to ensure intersecting routes had identical coordinates, worth the anal retentiveness and a little extra hassle.
True, with Tim's existing algorithms, close-enough points still flag an intersecting route; no-harm-no-foul.

One justification I see is future-proofing routes just in case a concurrency comes along in the future.

A real-world example:
ME US1 ME196 @ 43.919319°, -69.953256°
ME ME196 US1 @ 43.919319°, -69.953255°
I just checked this a few minutes ago, and [Popeye] Well blow me down! [/Popeye] -- the points actually have different coords. A perfect example.
(Looks like a case of that mysterious "point drift" that plagued files in the early years. But anyway...)
Intersecting highways are shown and linked just as intended.
BUT! In October 2014 (after the last hwy data update) ME24 was relocated, to follow US1, ME196, and Bypass Dr.
My local, updated ME ME24 has US1_S @ 43.919319°, -69.953256°.
If my local copies of all three files are submitted & ingested into the DB & yadda yadda, the US1/ME24 concurrency will register, but ME24/196 will not.

Having point coords match up from the get-go will avoid having to go back later when a new concurrency comes along and adjust point coords on intersecting routes.
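
The ME24 case can be sketched as follows, assuming concurrency detection requires exactly equal coordinates while a separate near-match check flags points that are merely close. The three waypoints are the ones listed above; the function names and the 0.0005-degree tolerance are hypothetical, not taken from Tim's or Jim's code.

```python
from collections import defaultdict

# (route, label, lat, lon) for the three waypoints at this intersection
waypoints = [
    ("ME US1", "ME196", 43.919319, -69.953256),
    ("ME ME196", "US1", 43.919319, -69.953255),
    ("ME ME24", "US1_S", 43.919319, -69.953256),
]

def colocated_groups(points):
    """Group routes whose waypoints share exactly the same coordinates."""
    groups = defaultdict(list)
    for route, _label, lat, lon in points:
        groups[(lat, lon)].append(route)
    return [sorted(routes) for routes in groups.values() if len(routes) > 1]

def near_miss_pairs(points, tolerance=0.0005):
    """Pairs of routes whose points are close but not identical."""
    pairs = []
    for i, (r1, _l1, lat1, lon1) in enumerate(points):
        for r2, _l2, lat2, lon2 in points[i + 1:]:
            same = (lat1, lon1) == (lat2, lon2)
            close = abs(lat1 - lat2) < tolerance and abs(lon1 - lon2) < tolerance
            if close and not same:
                pairs.append((r1, r2))
    return pairs

print(colocated_groups(waypoints))
print(near_miss_pairs(waypoints))
```

Here ME24 groups with US1 only, and ME196 turns up in both near-miss pairs, which is the US1/ME24-yes, ME24/ME196-no outcome described above. Comparing floats for equality is acceptable in this sketch because the values come from identical literals; code reading real files might compare the coordinate text instead.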

english si

Quote from: yakra on July 03, 2015, 01:56:52 PM
Having point coords match up from the get-go will avoid having to go back later when a new concurrency comes along and adjust point coords on intersecting routes.
Sure, but when some nasty peer-reviewer called Eric comes and asks me to move a point a tiny bit, do I move it on all intersecting routes (often ones that have been peer reviewed and perhaps are active) that were identical or do I simply wait for a concurrency to require the adjustment? ;)

It's clear you feel that moving waypoints so that the concurrency works is bad, so (to play devil's advocate) why should I do it when there's no concurrency to require it?

Also, I had one the other day where the points had identical coordinates (I triple-checked) in Tim's .wpt editor, but they weren't giving the DC error, and I could see they were different. Bizarreness in Arizona (making the Historic 66 file, given it is signed)!