-
MySQL GIS – Part 3
TweetWhat data is available?
GEO data is expensive to create, so has been created by governments. In the past governments charged for this data. In 1980 the USGS was charging $300 (usd) per county for Oklahoma GEO data. (I complained to my congressman.) Today, a quick Internet search turns up lots of free GIS data.
I was hoping to find a neat collection of basic GEO data. It would be nice if there was one place you could get world political borders (Polygons), postal codes (Polygons) and points of interest like hospitals and airports. What you can find is lots of lists, often collections of odd data created for a virility of complex political purpose. For example, The Global Change Master Directory is a large list of data sources on earth and climate change, but you will not find the data here.
Remember, as you dig for data there are two types Vector (text) and Raster (pictures). Most sites don’t distinguish between them or combine them for you making them less useful your your own uses.
If you find good sources of GEO data PLEASE share them with me so I can share them with everyone else.
Here is a short list of the sites I found and used to create my test data in my series of posts.
www.cloudmade.com – has shape files for the entire world with administrative, natural, coastline, water and points of interest. The data some from Open Street Maps and is available under the Creative Commons Attribution-ShareAlike 2.0 license.
data.geocomm.com -
gisdata.blogspot.com – Public domain GIS data and Free GIS data repositories and clearinghouses.
collinssoftware.com – http://www.collinssoftware.com/freegis_by_region.htm
census.gov/geo/www/tiger – http://www.census.gov/geo/www/tiger/index.html
http://www.hostip.info/dl/index.html
I live in the state of Oklahoma. Because I know it well, I’m using it for my examples. My search for Oklahoma GIS data turned up these sources. This should give you some idea of the data you might find in your searching. I found county borders (polygons), a list of hospitals (points), city borders (polygons), points of interest (points) and zip codes (polygons). Each of these came as shape files so the process was simple to get the data into MySQL.
libremap.org/data/state/oklahoma/ – View the entire list of USGS Oklahoma Digital Raster Graphic Maps
geo.ou.edu – County Boundary, Municipal Boundaries, Voting Precincts, School Districts, State House Districts, State Senate Districts
tin.er.usgs.gov/geology/state/ – Oklahoma geologic map data
okmaps.onenet.net – 25 digital-map data sets, known as the Digital Atlas of Oklahoma
Adding these to my ‘geo’ database was simple. I ran each .shp file through the ogr2ogr program.
wget http://www.okladot.state.ok.us/hqdiv/p-r-div/maps/shp-files/munibnd.zip
unzip munibnd.zip
ogr2ogr -f "MySQL" MySQL:"geo,user=root,host=localhost,password=" -nln oklahoma_cities -lco engine=MYISAM munibnd.shp
More examples on what you can do with GIS data and MySQL.
Viewing our GIS data.
How to collect your own GIS data.
Good and bad examples of searching GIS data.
Optimizing MySQL GIS. Is it really worth using?
Data sources shared by users.
-
dbbenchmark.com – MySQL (basic) connection pool support added
In this latest release I’ve added a basic MySQL connection pool to the benchmarking script which improves the method in which connections to MySQL are handled and reused. In addition, there have been some optimizations made to the thread handler functions for better debug reporting. Download the latest release now and see how your MySQL server performs against the rest of the community! Download here: download page.
-
2011 MySQL Conferences
Next year will mark a significant change for the MySQL community. At least three major conferences will have dedicated MySQL content that is great for attendees getting the best information on how to use MySQL from the experts in the field.
O’Reilly MySQL Conference & Expo
The 9th Annual MySQL conference will be held at is usual home of recent years. Colin will again be back as committee chair for a 3rd year and this will be my 6th straight MySQL conference.
Date: April 11 – 14, 2011
Location: Hyatt Regency, Santa Clara, California
Website: There is no website at this time
Call for Papers: There are no details for call for papers
Program Chairs: Colin Charles from Monty Program AB and Brian Aker.
Collaborate 11
Collaborate is a larger conference (4,000-5,000 attendees) that is actually three separate conferences in one run by the IOUG, OAUG and Quest. The IOUG content is generally a focus for Oracle DBA’s. Last year marked the first year with any MySQL sessions, and this year Collaborate will have dedicated MySQL tracks chaired by fellow ACE director Sheeri Cabral who is well known for her work in the MySQL community.
Date: April 10 – 14, 2011
Location: Orange County Convention Center West, Orlando, Florida
Website: http://collaborate11.ioug.org/
Call for Papers: Now open. Closes Friday October 1, 2010
Program Chair: Sheeri Cabral
KScope 11
ODTUG Kaleidoscope (Kscope for short) is a conference (1500 attendees) that is very focused on delivering the best content from the top community contributors for the communities benefit. 2010 was my first Kaleidoscope conference and I felt completely at home. Great people, great events and the best conference food I’ve had in many years.
With a dedicated MySQL track in 2010 for the first time I will again be the MySQL Program Chair in 2011 with an extended format for the MySQL developer and DBA. The focus will be the best way to develop successful applications with MySQL and will include Architecture, Performance Tuning, Best Practices, Case Studies and Hands-On streams.
Date: June 26 – 30, 2011
Website: http://kscope11.com
Location: Long Beach, California
Call for Papers: Closes Tuesday October 26, 2010
Program Chair: Ronald Bradford – Independent Consultant
Recap
2010 is also not over. MySQL Sunday at OOW promises to be a great event in San Francisco in under 2 weeks. You can still register at a very cheap price of $75 for 4 dedicated tracks of MySQL content. Open SQL Camp being organized also by Sheeri in Boston in October will continue the tradition of a small but focused and free event for the MySQL community.
-
Cloud systems vs. NoSQL email with tons of questions
I had an email after my webinar on NoSQL/SQL for Oracle Development Tools User Group last week (http://www.odtug.com/apex/f?p=500:1:0) from an attendee that was chock full of some questions. I decided to answer them to clarify with this fellow NoSQL and Cloud Systems. I'm pretty happy with my answers. I'd be glad for any thoughts from people about my replies.Here are my responses (the fellow's name is also Patrick):Patrick,You are welcome! Thank you for attending. I put that together a bit hasty but thought it was a good topic to be covering as there are so many organizations that are considering such an architecture.Patrick Francois wrote:> Hi,>> Thank you for the NoSQL Webinar!> Not an easy theme. ..kind of "wide open".> I have recently also tried getting more info on NoSQL and related systems and performance issues. ...and get confused, especially also with the "cloud systems".Yeah, a lot of buzzword-BS and reinventing the wheel going on. They key to remember is that SQL is not going to be replace SQL any time soon, but the combination of SQL and NoSQL can have its advantages.>> I checked for example this document regarding benchmarking "cloud systems":> (maybe you know that document as well)> -> "BenchmarkingCloudServingSystemswithYCSB" > (YCSB -> Yahoo!CloudServingBenchmark)> -> straight link: http://www.brianfrankcooper.net/pubs/ycsb.pdf>> There they speak about : explosion of new systems for datastorage and management "in the cloud"> They mention there all these different NoSQL storage systems,> and say for example also: Some systems are offered only as cloud services, either directly in the case of Amazon SimpleDB[1]Yeah, SimpleDB being Amazon's S3, a proprietary system. You can download it and run it on your own cluster or private cloud.> ....>> That's why I'm also a bit confused about "cloud systems" and "NoSQL".Cloud systems - usually virtual machines that can be easy spun up for elasticity - I want to add more servers to my virtual network on the fly. This can be via either your own VMWare Labmanager setup or something like EC2. There are also real hardware clouds using services such as what Rackspace offers.By "cloud systems" and "NoSQL", it just means running these databases in an environment such as EC2, VMware, or Rackspace, etc... For instance, you can have relational databases in the cloud such as with MySQL, Drizzle, or etc...You can also run them NoSQL databases on your own systems. NoSQL and Cloud systems are not mutually exclusive of each other.> Basically if you speak about cloud systems, you also speak about scaling out same as you are speaking about "scaling out" when speaking about NoSQL.Scaling out can be in the cloud as well as outside the cloud. It just means running applications and databases over a several systems versus using an every increasing big huge server and growing that server to scale as was done in the Good Old Days (TM).>> So, does it imply that NoSQL systems are cloud systems?No, they can be, but are not. NoSQL means a database system that doesn't use SQL to access data, non-relational> How would you see the relation "NoSQL" / "Cloud systems".They complement each other. NoSQL works well in a cloud paradigm.>> --> Even "cloud system" meaning as such is rather unclear.> There was the question about "Cassandra server farm":> If I create an own cassandra server farm, can I then still speak about a "cloud system", or can I speak about a "cloud system" only when underlying servers are also geographically distrubuted?Geographical distribution isn't the determinant in the definition of cloud system. I could have a cloud system down in my cellar if I wanted to if I have multiple real or virtual servers.>> There is also the "security" issue when it comes to "cloud systems" and some say security is bigger because data is distributed. I understand that in that way, you cannot really get hand (or get hacked) on all the data at once, eventually just a part of the data.> But if I think on an own Cassandra farm, where eventually all machines are in the same machine room and network, I can imagine if you can get into one machine, you can get into all of them.This issue requires you to do some homework about how to secure your servers. Set up a good image with everything locked down and use something like puppet to have it come up with all the goodies you set up in that image. Define a list of must-haves before that box (virtual or real) is on the network.>> From the YCSB-document "some systems are offered only as cloud services",> I understand that for using Amazon's SimpleDB, you could not create your own server farm, rather need to buy that service from Amazon?Yeah, you're not going to run your own S3. You can run other NoSQL or distrubuted file systems - pick your choice with a Google Search.>> That is causing me confusion between "NoSQL" and "Cloud systems".NoSQL - as I defined in the presentation - schema less, often non-relational, doesn't use SQL as the Linga Franca of data access.Cloud systems - multiple boxes, real or virtual, elastic, as I mention above.Two different but mutually complementary concepts.> --> Memcached and Membased sounded very interesting. I will check more on those.Please do. And do join the mailing lists. I see Matt Ingenthron and Perry Krug answering emails every day!> --> Can you eventually also provide the slides you used?Certainly - let me make sure they care clean and I'll send them to you!
-
Careful how you monitor MySQL
I was recently struck by a problem which is unusual. In order to keep an eye on the database server I use nagios, cacti, merlin and some local scripts to monitor the database instance and ensure that it is working properly. That normally works fine. The different monitor processes do various things, one of which is to monitor the replication status of a slave, and warn me if the replication is not working or if it’s behind. This is done with the command SHOW SLAVE STATUS.
The server I was looking at runs some large local batch jobs aggregating data. Unfortunately, I was experiencing that replication was interfering with these batch jobs so decided to see if things would perform better if I stopped replication and let the batch jobs complete, restarting replication afterwards.
So the command STOP SLAVE was sent to the server, and this took some time. In the meantime SHOW SLAVE STATUS hangs. The STOP SLAVE command waits for the SQL replication thread to finish it’s task but that thread was waiting on the tables being used by the batch process, which was already running. The consequence of this was that STOP SLAVE waited, all calls to SHOW SLAVE STATUS blocked and with nagios, cacti, merlin and the local scripts all doing periodic SHOW SLAVE STATUS commands which would hang I ended up with the mysqld running out of user connections. Nasty!
While I see if MySQL can do something about the behaviour of stopping the slave and showing the slave status I’m going to implement grants which limit the monitor users so they are only allowed to have a user concurrent connections open. This is done using the syntax
GRANT whatever ON whereever TO some_user WITH MAX_USER_CONNECTIONS 5;
5 looks like a good number for nagios, merlin and the local scripts and 10 is probably sufficient as I collect a lot of graph data for the mysql server and the cacti requests are often done in parallel.
So if you monitor MySQL it may be worth you applying some sort of connection limit on your monitoring scripts, if not on other database users.
|