Thursday, June 15, 2006

Oracle Spatial + Where 2.0?

Ok...

So I just got back from the Where 2.0 conference. All in all it was a pretty good show. I have my issues with some of the talks, but overall it was really good. The question I have is. Where was Oracle?

Don't you think that Oracle Spatial should have been there?

Maybe I missed them, but they were not presenting and they were not in the participant list.

Intel, Google, Yahoo, Microsoft, Autodesk, ESRI, MapInfo, Ask, AOL, Cisco, Motorola, MapQuest, eBay, Apple, and many others were there.

Some presenters really talked about Oracle Spatial though...

Jeremy

Tuesday, April 11, 2006

Consuming REST services with Oracle

I love REST services (representation state transfer). They provide a powerful yet simple way of consuming data via the web (Some examples of REST services: Yahoo, some ArcWeb Services, Mapdex API, Flickr, even WMS services). Here is an example of one I made up.

I want to be able to query the National Elevation Dataset (via http://seamless.usgs.gov ) to find out what the elevation is at a particular latitude, longitude coordinate. I want to submit a geographic point and extract the elevation value at the point. USGS is publishing this as an ArcIMS image service (USGS_EDC_Elev_NED). I can use the get_raster_info request to extract the elevation at a particular point.

Here is my test REST service:


http://hercules.kgs.ku.edu/kgs/web_services/NED_REST/seamless_ned_elevation.cfm?
latitude=38&longitude=-115
&mapserver=seamless.usgs.gov
&mapservice=USGS_EDC_Elev_NED
&layer_id=NED.CONUS_NED


You can view the source code here.

It queries the service and reformats the ArcIMS response to the following XML output:


<NED_ELEVATION>
<ELEVATION_LOCATION latitude="38" longitude="-115">
<ELEVATION_METERS>1627.53918457031</ELEVATION_METERS>
<ELEVATION_FEET>5339.95606458</ELEVATION_FEET>
</ELEVATION_LOCATION>
</NED_ELEVATION>


In order to access this document in Oracle we need to take advantage of two features. One is the utl_http package (see AskTom or the Oracle docs for more info) to request an html page via a url.


This query will actually return the text/xml response from the server as a 4000 character string to the sqlplus window (use request_pieces for larger files).

We then have to turn this character representation into an xml document so we can query it.

To do this we can use the xmltype datatype (an Oracle system defined datatype) that allows us to parse an xml document and query it via xpath.

Since my REST service returns an xml document all we have to do is store the xml document as an xmltype datatype and use xpath to query the xml document to retrieve the relevant values.

select a.xmldataout.extract
('//KANSAS_ELEVATION/ELEVATION_LOCATION/ELEVATION_FEET/text()').getNumberVal()
as elevation_feet,
a.xmldataout.extract
('//KANSAS_ELEVATION/ELEVATION_LOCATION/ELEVATION_METERS/text()').getNumberVal()
as elevation_meters
from (select
xmltype(UTL_HTTP.REQUEST
('http://geoportal.kgs.ku.edu/kgs/web_services/kansas_elevation/
ks_ned_elevation_datum.cfm?latitude=39.1&longitude=-99.6&datum=27'
)
) xmldataout from dual) a
where a.xmldataout.extract('//KANSAS_ELEVATION/ERROR/text()').getStringVal() is null

Which returns...
ELEVATION_FEET ELEVATION_METERS
-------------- ----------------
2886.25208 879.686707

Do you see the power? I can do this for single point or as an update to an entire table.

So now I can map the elevation along my run route...



Tuesday, March 28, 2006

Database + GIS = Powerful Visualization

I love GIS. Here are two flat Oracle tables one describing the top values (depth to a particular formation) for all the wells in Kansas the other summarizing the amount of production by section (PLSS square mile), by year, by type (oil or gas), and by formation.

SQL> desc STRAT_WELL_TOP_22MAR2006
Name Null? Type
---------------------------------- -------- ------------------
FORMATION_NAME NOT NULL VARCHAR2(60)
WELL_HEADER_KID NOT NULL NUMBER(10)
WELL_TOP NUMBER(8,2)
FIELD_KID NUMBER(10)
LATITUDE NUMBER(11,6)
LONGITUDE NUMBER(11,6)
GROUND_ELEVATION VARCHAR2(40)
GROUND_ELEVATION_SOURCE VARCHAR2(12)
WELL_TOP_SEALEVEL_NON_NED VARCHAR2(40)
WELL_TOP_SEALEVEL NUMBER

SQL> desc ACRES_640_PRODUCTION_FORMATION
Name Null? Type
---------------------------------------- -------- -------------
RECNMBR NOT NULL NUMBER(11)
STRAT_UNIT_KID NOT NULL NUMBER(10)
YEAR NOT NULL NUMBER(4)
PRODUCT NOT NULL VARCHAR2(1)
PRODUCTION NUMBER(12,2)
The goal is to produce a multipatch polygon between two formations colored by the cummulative amount of oil and gas production between the formaitons. Today I built a geoprocessing model to accomplish this task. Here are the steps I need to...

  1. Select records for two distinct formations,
  2. Generate event themes based on the x,y coordinates of the records in step 1,
  3. Generate two distinct TIN representations of the x,y,z values,
  4. Create a view that summarizes the production between the two formations by section and join that datasaet to a spatial layer depicting the sections,
  5. Use the extrude between process to generate a multi-patch polygon using the two tin datasets as the top and bottom surfaces clipped by the section production.
Anyway enough talk...Here is the model:













And here is the output...



The cool thing about the model is that I can rerun it based on any input query (set of formations). Build once, run everywhere.

Cheers,

Jeremy

Monday, March 20, 2006

SQL Interpolation Techniques

First off I have just returned from the ESRI developers summit and I had a great time. It was so nice to meet all the great developers out there. ESRI did a great job in organizing it. If you ever get a chance to meet Brian Flood take advantage of it. Brian is an amazing developer and I really enjoyed our coversations. Brian gets it...

I file this one under, stop me before I try to do too much with SQL! I wrote this a couple of years ago when I wanted have an interpolation routine that was straight SQL (No Oracle Spatial and No SDE). I would not do this technique now (ESRI Spatial Type for Oracle or straight to ArcGIS Server for my interpolations) but it illustrates some good Analytical Function practices. Enjoy...

ORIGINAL QUESTION:

I have a table that represents one mile polygons (sections) that has locational
information, center latitude and center longitude. I have another table that is
made up of point data that has for every point the latitude, longitude, and some
value (lets say it is a water level depth). What I want to do is build a view
that has the section polygon ID and an average water level value for the closest
10 wells that are at least within 10 square miles. This involves computing the
distance of all wells within a 10 square mile of the polygon cell, sorting those
wells to find the closest 10, and then averaging the results (actually I would
perform some kind of inverse distance calculation, but for simplicity sake I
would like to see how you would efficiently select and average the 10 wells
values).


I HAVE PASTED MY ATTEMPT BELOW...

HERE IS INFORMATION ABOUT THE TWO TABLES THAT QUERY USES.


SQL> DESC ACRES640_XYCENTER
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJECTID NOT NULL NUMBER(38)
TRS VARCHAR2(10)
RECNMBR NUMBER(11)
CENTER_X NUMBER(23,11)
CENTER_Y NUMBER(23,11)
SHAPE NUMBER(38)

SQL> select trs, recnmbr, center_x, center_y from ACRES640_XYCENTER where rownum < 10;

TRS RECNMBR CENTER_X CENTER_Y
---------- ---------- ---------- ----------
02S22W08 2816 -133815 431037
02S05E07 2817 114097 430710
02S12E07 2818 181440 431992
02S03W08 2819 48505.5 430018
02S23W12 2820 -136969 431089
02S05W08 2821 29424.6992 429929
02S04E12 2822 112494 430684
02S25W10 2823 -159307 431515
02S10E07 2824 162297 431573

9 rows selected.


SQL> SELECT COUNT(*) FROM ACRES640_XYCENTER;

COUNT(*)
----------
81803


SQL> desc WELLHEADERS_25MAY2004_PRJ
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJECTID NOT NULL NUMBER(38)
KID NUMBER(10)
LATITUDE NUMBER(17,6)
LONGITUDE NUMBER(17,6)
X_COORD NUMBER(23,11)
Y_COORD NUMBER(23,11)
SHAPE NUMBER(38)



SQL> select kid, latitude, longitude, x_coord, y_coord from WELLHEADERS_25MAY2004_PRJ;

KID LATITUDE LONGITUDE X_COORD Y_COORD
---------- ---------- ---------- ---------- ----------
1027937589 37.00094 -95.77342 219360 113565
1027937587 37.00094 -95.77342 219360 113565
1027937585 37.00094 -95.77342 219360 113565
1027937583 37.00094 -95.77342 219360 113565
1027937567 37.00094 -95.77342 219360 113565
1027937237 37.00094 -95.77342 219360 113565
1027937235 37.00094 -95.77342 219360 113565
1027935075 37.00094 -95.77342 219360 113565
1027803979 37.00094 -95.77342 219360 113565
1027936869 37.00094 -95.77342 219360 113565
1027936867 37.00094 -95.77342 219360 113565
1027936865 37.00094 -95.77342 219360 113565
1027936805 37.00094 -95.77342 219360 113565
1027937057 37.00094 -95.77342 219360 113565
1027934581 37.00094 -95.77342 219360 113565
1027934579 37.00094 -95.77342 219360 113565
1027934577 37.00094 -95.77342 219360 113565
1027934575 37.00094 -95.77342 219360 113565
1027934573 37.00094 -95.77342 219360 113565
1027934571 37.00094 -95.77342 219360 113565
1027934569 37.00094 -95.77342 219360 113565
1027804273 37.00094 -95.77342 219360 113565
1027804269 37.00094 -95.77342 219360 113565
1006790617 37.00133 -96.08868 191441 112895
1002895443 37.00133 -96.21841 179952 112630
1006790402 37.00133 -96.21841 179952 112630
1001257052 37.00132 -101.97293 -329700 117372
1006094323 37.00132 -101.97293 -329700 117372
1006094606 37.00132 -101.97746 -330101 117389
1020067001 37.00132 -101.97746 -330101 117389
1005449837 37.00132 -101.97746 -330101 117389
1026090216 37.0013 -100.57895 -206286 113259
1006093808 37.0013 -100.57895 -206286 113259
1026525325 37.0013 -100.57895 -206286 113259
1027805483 37.00128 -95.84634 212901 113429
1027937579 37.00128 -95.84634 212901 113429
1027807657 37.00128 -95.84634 212901 113429
1027807573 37.00128 -95.84634 212901 113429
1027807185 37.00128 -95.84634 212901 113429
1027703233 37.00128 -98.83632 -51938.301 110781
1027796075 37.00128 -98.83632 -51938.301 110781
1002901280 37.00127 -97.07416 104158 111286
1008415306 37.00127 -101.51646 -289294 115812
1027807735 37.00122 -95.82817 214511 113465
1027805485 37.00122 -95.82817 214511 113465
1027807499 37.00122 -95.82817 214511 113465
1027806799 37.00122 -95.82817 214511 113465
1027806797 37.00122 -95.82817 214511 113465
1027806517 37.00122 -95.82817 214511 113465
1027805877 37.00122 -95.82817 214511 113465
1006143725 37.00122 -100.71236 -218100 113562
1002932550 37.0012 -95.81286 215866 113499
1006790987 37.00119 -95.80991 216128 113505
1005456974 37.00119 -95.80991 216128 113505
1027805555 37.00119 -95.80992 216127 113505
1006790391 37.00111 -96.19579 181955 112651
1002895619 37.00111 -96.41649 162409 112233
1006790787 37.0011 -95.9109 207185 113260
1027805141 37.00109 -95.79157 217752 113538
1027805077 37.00109 -95.79157 217752 113538
1027931973 37.00109 -95.79157 217752 113538
1027931971 37.00109 -95.79157 217752 113538
1027806551 37.00109 -95.79157 217752 113538
1004751365 37.00159 -101.81951 -316120 116857
1002876182 37.00159 -101.82404 -316521 116873
1002895677 37.00158 -96.26826 175536 112560
1002952960 37.00158 -97.65121 53042.6992 110821

68 rows selected.



SQL> SELECT COUNT(*) FROM WELLHEADERS_25MAY2004_PRJ;

COUNT(*)
----------
375571


SQL>
SQL> COLUMN COLUMN_NAME FORMAT A15 TRUNC
SQL> ;
1 SELECT INDEX_NAME, COLUMN_NAME, TABLE_NAME FROM USER_IND_COLUMNS
2* WHERE TABLE_NAME IN ('WELLHEADERS_25MAY2004_PRJ','ACRES640_XYCENTER')
SQL> /

INDEX_NAME COLUMN_NAME TABLE_NAME
------------------------------ --------------- ------------------------------
A1411_IX1 SHAPE ACRES640_XYCENTER
R2067_SDE_ROWID_UK OBJECTID ACRES640_XYCENTER
ACRES640_XYCENTER_RECNMBR RECNMBR ACRES640_XYCENTER
ACRES640_XYCENTER_X_YCENT CENTER_X ACRES640_XYCENTER
ACRES640_XYCENTER_X_YCENT CENTER_Y ACRES640_XYCENTER
ACRES640_XYCENTER_CENTX CENTER_X ACRES640_XYCENTER
ACRES640_XYCENTER_CENTY CENTER_Y ACRES640_XYCENTER
A1409_IX1 SHAPE WELLHEADERS_25MAY2004_PRJ
R2065_SDE_ROWID_UK OBJECTID WELLHEADERS_25MAY2004_PRJ
WELLHEADERS_25_MAY2004KID KID WELLHEADERS_25MAY2004_PRJ
WELLHEADERS_25_MAY2004_LATLONG Y_COORD WELLHEADERS_25MAY2004_PRJ
WELLHEADERS_25_MAY2004_LATLONG X_COORD WELLHEADERS_25MAY2004_PRJ
WELLHEADERS_25_MAY2004_YCOORD Y_COORD WELLHEADERS_25MAY2004_PRJ
WELLHEADERS_25_MAY2004_XCOORD X_COORD WELLHEADERS_25MAY2004_PRJ

14 rows selected.




HERE IS THE QUERY THAT I CAME UP WITH....



FIRST I WANT TO SELECT THE SECTION ID, WELL ID, THE DISTANCE FROM THE WELL TO THE
CENTER OF THE SECTION, & THE QUADRANT OF THE SECTION (NW,NE,SW,SE) THAT THE WELL
IS IN ALL WITHIN A 7200 X 7200 METER BOUDING BOX (CENTERED ON THE SECTION CENTER)
FOR EVERY SECTION IN MY STUDY AREA.

select a.recnmbr recnmbr, b.kid kid,
case
when b.x_coord < a.center_x and b.Y_COORD < a.center_y then 'SW'
when b.x_coord < a.center_x and b.Y_COORD > a.center_y then 'NW'
when b.x_coord > a.center_x and b.Y_COORD < a.center_y then 'SE'
when b.x_coord > a.center_x and b.Y_COORD > a.center_y then 'NE'
end QUADRANT,
sqrt( ( (b.x_coord - a.center_x) * (b.x_coord - a.center_x) ) +
( (b.Y_COORD - a.center_y) * (b.Y_COORD - a.center_y) ) ) distance_to_cell
from ACRES640_XYCENTER a, WELLHEADERS_25MAY2004_PRJ b
where b.y_coord between a.center_y - 3600 and a.center_y + 3600
and b.x_coord between a.center_x - 3600 and a.center_x + 3600
and a.center_x >= -320688
and a.center_x <= -284100
and a.center_y >= 169518
and a.center_y <= 242855




RECNMBR KID QU DISTANCE_TO_CELL
---------- ---------- -- ----------------
49950 1006068633 SW 4129.70144
49950 1006068137 SW 3580.94959
49950 1002919963 SW 4519.08055
49950 1006068032 SE 3446.22605
49950 1006068206 SW 2316.18501
49950 1022362561 SW 3422.35124
49950 1006068139 SW 1594.49365
49950 1001235397 SW 3128.24951
49950 1022009618 SW 2460.49934
49950 1006068479 SW 1646.93169
49950 1006068208 SW 2039.86985
49950 1026642145 NW 2022.24281
49950 1025778509 NW 2062.54042
49950 1025686933 NW 3158.56803
49950 1022362563 NW 3747.38909
49950 1006068200 NW 3791.15352
49950 1022009617 NW 2596.05951
49950 1020066904 NW 3337.37142
49950 1006068395 NW 4568.36393
49952 1006068633 SW 3628.02563
49952 1006068437 SW 4142.8273
49952 1006068137 SW 3201.38033
49952 1002919963 SW 3479.8352
49952 1008378808 SW 4545.26149
49952 1002920004 SW 4531.73697
49952 1008764319 SW 3459.29299
49952 1006068206 SE 2021.65502
49952 1022362561 SW 2330.793
49952 1006067823 SW 2810.31617
49952 1002919902 SW 3859.54661
49952 1006068139 SE 2240.00223
49952 1001235397 SW 1855.94854
49952 1022009618 SW 1463.18591
49952 1006068585 SW 2491.16539
49952 1006068479 SE 1237.73503
49952 1006068381 SW 3055.63512
49952 1027708907 SW 3536.07494
49952 1006067716 SW 2673.85303
49952 1006068208 SW 558.660004
49952 1006067854 SW 2193.10054
49952 1026642145 NW 435.910541
49952 1025778509 NW 603.927976
49952 1002919979 NW 2058.37266
49952 1025686933 NW 1719.64764
49952 1022362563 NW 2248.00022
49952 1006068200 NW 2332.77196
49952 1006068683 NW 3040.36527
49952 1006067983 NW 3812.09299
49952 1022009617 NW 1515.13201
49952 1020066904 NW 2175.32802

THEN I USE THE ANALYTICAL FUNCTION DENSE_RANK TO ASSIGN A RANKING VALUE FOR
THE DISTANCE BY SECTION NUMBER (RECNMBR) TO EACH WELL

select recnmbr, kid, quadrant, distance_to_cell,
dense_rank()
over (partition by recnmbr order by distance_to_cell ) TOPN_TOTAL
from (
select a.recnmbr recnmbr, b.kid kid,
case
when b.x_coord < a.center_x and b.Y_COORD < a.center_y then 'SW'
when b.x_coord < a.center_x and b.Y_COORD > a.center_y then 'NW'
when b.x_coord > a.center_x and b.Y_COORD < a.center_y then 'SE'
when b.x_coord > a.center_x and b.Y_COORD > a.center_y then 'NE'
end QUADRANT,
sqrt( ( (b.x_coord - a.center_x) * (b.x_coord - a.center_x) ) +
( (b.Y_COORD - a.center_y) * (b.Y_COORD - a.center_y) ) ) distance_to_cell
from ACRES640_XYCENTER a, WELLHEADERS_25MAY2004_PRJ b
where b.y_coord between a.center_y - 3600 and a.center_y + 3600
and b.x_coord between a.center_x - 3600 and a.center_x + 3600
and a.center_x >= -320688
and a.center_x <= -284100
and a.center_y >= 169518
and a.center_y <= 242855
)




RECNMBR KID QU DISTANCE_TO_CELL TOPN_TOTAL
---------- ---------- -- ---------------- ----------
49948 1006068139 SW 2280.82989 1
49948 1006068032 SE 2320.57859 2
49948 1006068479 SW 3008.3173 3
49948 1006068206 SW 3434.70581 4
49948 1006068198 SE 3514.10543 5
49948 1006067819 NE 4519.37982 6
49948 1006068137 SW 4533.81749 7
49948 1006068188 SE 4542.85417 8
49950 1006068139 SW 1594.49365 1
49950 1006068479 SW 1646.93169 2
49950 1026642145 NW 2022.24281 3
49950 1006068208 SW 2039.86985 4
49950 1025778509 NW 2062.54042 5
49950 1006068206 SW 2316.18501 6
49950 1022009618 SW 2460.49934 7
49950 1022009617 NW 2596.05951 8
49950 1001235397 SW 3128.24951 9
49950 1025686933 NW 3158.56803 10
49950 1020066904 NW 3337.37142 11
49950 1022362561 SW 3422.35124 12
49950 1006068032 SE 3446.22605 13
49950 1006068137 SW 3580.94959 14
49950 1022362563 NW 3747.38909 15
49950 1006068200 NW 3791.15352 16
49950 1006068633 SW 4129.70144 17
49950 1002919963 SW 4519.08055 18
49950 1006068395 NW 4568.36393 19
49952 1026642145 NW 435.910541 1
49952 1006068208 SW 558.660004 2
49952 1025778509 NW 603.927976 3
49952 1006068479 SE 1237.73503 4
49952 1022009618 SW 1463.18591 5
49952 1022009617 NW 1515.13201 6
49952 1025686933 NW 1719.64764 7
49952 1001235397 SW 1855.94854 8
49952 1006068206 SE 2021.65502 9
49952 1002919979 NW 2058.37266 10
49952 1020066904 NW 2175.32802 11
49952 1006067854 SW 2193.10054 12
49952 1006068139 SE 2240.00223 13
49952 1022362563 NW 2248.00022 14
49952 1022362561 SW 2330.793 15
49952 1006068200 NW 2332.77196 16
49952 1006068585 SW 2491.16539 17
49952 1006067716 SW 2673.85303 18
49952 1006067823 SW 2810.31617 19
49952 1006068683 NW 3040.36527 20
49952 1006068381 SW 3055.63512 21
49952 1006068137 SW 3201.38033 22
49952 1006068395 NW 3454.67292 23
49952 1008764319 SW 3459.29299 24
49952 1002919963 SW 3479.8352 25
49952 1027708907 SW 3536.07494 26
49952 1006068633 SW 3628.02563 27
49952 1006067983 NW 3812.09299 28
49952 1002919902 SW 3859.54661 29
49952 1013358575 NW 3955.73204 30
49952 1006068437 SW 4142.8273 31
49952 1002920004 SW 4531.73697 32
49952 1008378808 SW 4545.26149 33


NEXT WE SELECT THE 8 CLOSEST WELLS FOR EACH SECTION, WE ALSO NEED TO COMPUTE
THE COUNT OF THE DISTINCT QUADRANTS. WE NEED THIS BECAUSE SOME INTERPOLATION
TECHNIQUES WORK BETTER IF YOU FORCE THE INTERPOLATED VALUE TO HAVE AT LEAST X
NUMBER OF POINTS IN EACH QUADRANT. IN THIS CASE WE ARE GOING TO WANT TO HAVE
AT LEAST ONE WELL IN EACH QUADRANT.

select recnmbr, kid, distance_to_cell, QUADRANT, TOPN_TOTAL,
count(distinct QUADRANT) over (partition by recnmbr) QUAD_COUNT
from (
select recnmbr, kid, distance_to_cell, QUADRANT, TOPN_TOTAL
from (
select recnmbr, kid, quadrant, distance_to_cell,
dense_rank()
over (partition by recnmbr order by distance_to_cell ) TOPN_TOTAL
from (
select a.recnmbr recnmbr, b.kid kid,
case
when b.x_coord < a.center_x and b.Y_COORD < a.center_y then 'SW'
when b.x_coord < a.center_x and b.Y_COORD > a.center_y then 'NW'
when b.x_coord > a.center_x and b.Y_COORD < a.center_y then 'SE'
when b.x_coord > a.center_x and b.Y_COORD > a.center_y then 'NE'
end QUADRANT,
sqrt( ( (b.x_coord - a.center_x) * (b.x_coord - a.center_x) ) +
( (b.Y_COORD - a.center_y) * (b.Y_COORD - a.center_y) ) ) distance_to_cell
from ACRES640_XYCENTER a, WELLHEADERS_25MAY2004_PRJ b
where b.y_coord between a.center_y - 3600 and a.center_y + 3600
and b.x_coord between a.center_x - 3600 and a.center_x + 3600
and a.center_x >= -320688
and a.center_x <= -284100
and a.center_y >= 169518
and a.center_y <= 242855
)
)
where TOPN_TOTAL <= 8
)


RECNMBR KID DISTANCE_TO_CELL QU TOPN_TOTAL QUAD_COUNT
---------- ---------- ---------------- -- ---------- ----------
49934 1006068645 1800.1 NE 1 3
49934 1006068218 2281.36122 NE 3 3
49934 1006068541 3524.63587 NW 6 3
49934 1006068571 3722.9722 NW 7 3
49934 1006068190 1808.89607 SW 2 3
49934 1006068182 2839.38479 SW 4 3
49934 1006068198 3833.55879 SW 8 3
49934 1006068545 2948.63036 SW 5 3
49935 1006068645 513.778162 NE 1 4
49935 1006079840 4155.58817 NE 7 4
49935 1006068218 1812.26185 NW 2 4
49935 1006068541 4270.07283 NW 8 4
49935 1027709052 3397.06712 SE 3 4
49935 1023930739 3397.06712 SE 3 4
49935 1006068190 3402.91713 SW 4 4
49935 1006068545 3581.14074 SW 5 4
49935 1006068182 4043.25834 SW 6 4
49937 1001243091 2834.38124 NE 1 4
49937 1006344036 3447.8283 NE 6 4
49937 1006068645 3079.22019 NW 3 4
49937 1006079840 3140.74704 NW 4 4
49937 1006068218 3822.73018 NW 8 4
49937 1006078290 3037.51165 SE 2 4
49937 1006078651 3798.06859 SE 7 4
49937 1027709052 3203.50433 SW 5 4
49937 1023930739 3203.50433 SW 5 4
49938 1006079840 3328.76974 NE 4 3
49938 1006344036 4377.68249 NE 7 3
49938 1001243091 3437.66912 NE 5 3
49938 1006068645 1535.64091 NW 1 3
49938 1006068218 2537.84357 NW 2 3
49938 1027709052 2893.72839 SE 3 3
49938 1006078290 3944.43317 SE 6 3
49938 1023930739 2893.72839 SE 3 3


FINALLY WE SELECT ALL SECTIONS THAT HAVE A DISTINCT QUAD COUNT OF 4.
THE 8 CLOSEST WELLS THAT HAVE AT LEAST ONE WELL IN EACH QUADRANT.

select recnmbr, kid, distance_to_cell, QUADRANT, TOPN_TOTAL
from (
select recnmbr, kid, distance_to_cell, QUADRANT, TOPN_TOTAL,
count(distinct QUADRANT) over (partition by recnmbr) QUAD_COUNT
from (
select recnmbr, kid, distance_to_cell, QUADRANT, TOPN_TOTAL
from (
select recnmbr, kid, quadrant, distance_to_cell,
dense_rank()
over (partition by recnmbr order by distance_to_cell ) TOPN_TOTAL
from (
select a.recnmbr recnmbr, b.kid kid,
case
when b.x_coord < a.center_x and b.Y_COORD < a.center_y then 'SW'
when b.x_coord < a.center_x and b.Y_COORD > a.center_y then 'NW'
when b.x_coord > a.center_x and b.Y_COORD < a.center_y then 'SE'
when b.x_coord > a.center_x and b.Y_COORD > a.center_y then 'NE'
end QUADRANT,
sqrt( ( (b.x_coord - a.center_x) * (b.x_coord - a.center_x) ) +
( (b.Y_COORD - a.center_y) * (b.Y_COORD - a.center_y) ) ) distance_to_cell
from ACRES640_XYCENTER a, WELLHEADERS_25MAY2004_PRJ b
where b.y_coord between a.center_y - 3600 and a.center_y + 3600
and b.x_coord between a.center_x - 3600 and a.center_x + 3600
and a.center_x >= -320688
and a.center_x <= -284100
and a.center_y >= 169518
and a.center_y <= 242855
)
)
where TOPN_TOTAL <= 8
)
)
where QUAD_COUNT = 4

RECNMBR KID DISTANCE_TO_CELL QU TOPN_TOTAL
---------- ---------- ---------------- -- ----------
49935 1006068645 513.778162 NE 1
49935 1006079840 4155.58817 NE 7
49935 1006068218 1812.26185 NW 2
49935 1006068541 4270.07283 NW 8
49935 1027709052 3397.06712 SE 3
49935 1023930739 3397.06712 SE 3
49935 1006068190 3402.91713 SW 4
49935 1006068545 3581.14074 SW 5
49935 1006068182 4043.25834 SW 6
49937 1001243091 2834.38124 NE 1
49937 1006344036 3447.8283 NE 6
49937 1006068645 3079.22019 NW 3
49937 1006079840 3140.74704 NW 4
49937 1006068218 3822.73018 NW 8
49937 1006078290 3037.51165 SE 2
49937 1006078651 3798.06859 SE 7
49937 1027709052 3203.50433 SW 5
49937 1023930739 3203.50433 SW 5
49942 1006068571 3908.5835 NE 7
49942 1006067819 3588.20693 NW 6
49942 1006068190 1423.03127 SE 1
49942 1006068545 3960.37902 SE 8
49942 1006068182 2597.02849 SE 3
49942 1006068198 1792.30578 SW 2
49942 1006068032 2809.95267 SW 4
49942 1006068188 3394.99411 SW 5
49960 1025686933 1021.17188 NE 3
49960 1025778509 1262.06537 NE 8
49960 1026642145 1187.63041 NE 6
49960 1002919979 575.591869 NW 1
49960 1022362563 1078.89063 NW 4
49960 1006068208 1259.78292 SE 7
49960 1006067854 595.014286 SW 2
49960 1006067716 1167.87414 SW 5
49965 1006068379 573.098595 NE 3
49965 1006067673 1086.42993 NE 6
49965 1022362558 591.472738 NW 4
49965 1026642147 1239.35548 NW 7
49965 1021121940 1074.38029 SE 5
49965 1006068265 1253.18953 SE 8
49965 1002919999 276.06521 SW 1
49965 1025777349 300.026665 SW 2
49966 1006068143 574.381406 NE 5
49966 1006067881 1014.58563 NE 7
49966 1002919986 572.856876 NW 4
49966 1002919921 277.200289 SE 1
49966 1008378793 280.189222 SE 2
49966 1001235373 589.896601 SE 6
49966 1006068265 557.917557 SW 3
49966 1021121940 1121.31396 SW 8
49969 1002919979 1253.72166 NE 8
49969 1022362560 608.152941 NW 3
49969 1006067716 855.977803 SE 4
49969 1006067854 1024.65311 SE 5
49969 1008378866 566.39209 SW 1
49969 1027708907 1095.42777 SW 6
49969 1001235373 1251.9976 SW 7
49969 1027625616 591.959458 SW 2
49979 1006068577 531.497883 NE 3
49979 1006068477 1251.4128 NE 6
49979 1006068591 1255.48755 NW 7
49979 1013358584 614.18564 SE 4
49979 1019411297 1270.18739 SE 8
49979 1006812097 280.189222 SW 1
49979 1006067545 282.242803 SW 2
49979 1006068726 1042.14826 SW 5
AND ON AND ON....
68143 1028445209 1044.73585 NE 2
68143 1002950389 1708.98391 NE 5
68143 1006155639 2714.99411 NE 8
68143 1025830097 1073.93203 NE 3
68143 1006154723 574.464968 NW 1
68143 1002950475 2615.97898 SE 7
68143 1006155021 1442.30718 SW 4
68143 1028445211 1994.58467 SW 6
68167 1006155879 1637.48099 NE 5
68167 1001322873 1704.43246 NE 6
68167 1022009791 1951.95722 NW 8
68167 1006154897 1160.19007 SE 1
68167 1002950448 1427.05431 SE 3
68167 1006154895 1501.003 SE 4
68167 1006155257 1385.55873 SW 2
68167 1025830091 1778.85188 SW 7
68184 1006155899 618.684087 NE 2
68184 1006155931 1399.38915 NE 3
68184 1006154883 1695.66329 NE 8
68184 1006154895 1690.6274 NW 6
68184 1004752885 1492.82618 SE 4
68184 1027709638 1693.48664 SE 7
68184 1004752884 13.892444 SW 1
68184 1004752837 1620.26448 SW 5
68187 1006154875 1567.44537 NE 3
68187 1006155899 2200.32952 NE 7
68187 1006154895 101.828287 NW 1
68187 1006155879 1575.13079 NW 4
68187 1001322873 2265.00618 NW 8
68187 1006154897 423.398158 NW 2
68187 1004752884 1595.49271 SE 5
68187 1002950448 1993.30304 SW 6

4396 rows selected.

Elapsed: 00:00:25.04

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1402 Card=2330 Bytes
=160770)

1 0 VIEW (Cost=1402 Card=2330 Bytes=160770)
2 1 WINDOW (SORT) (Cost=1402 Card=2330 Bytes=130480)
3 2 VIEW (Cost=1329 Card=2330 Bytes=130480)
4 3 WINDOW (SORT PUSHED RANK) (Cost=1329 Card=2330 Bytes
=69900)

5 4 TABLE ACCESS (BY INDEX ROWID) OF 'WELLHEADERS_25MA
Y2004_PRJ' (Cost=2 Card=2 Bytes=32)

6 5 NESTED LOOPS (Cost=1264 Card=2330 Bytes=69900)
7 6 TABLE ACCESS (FULL) OF 'ACRES640_XYCENTER' (Co
st=72 Card=993 Bytes=13902)

8 6 INDEX (RANGE SCAN) OF 'WELLHEADERS_25_MAY2004_
LATLONG' (NON-UNIQUE) (Cost=6 Card=1690)





Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
65267 consistent gets
142 physical reads
0 redo size
183474 bytes sent via SQL*Net to client
2326 bytes received via SQL*Net from client
295 SQL*Net roundtrips to/from client
0 sorts (memory)
2 sorts (disk)
4396 rows processed

SQL>


THE NEXT STEP WOULD BE ACTUALLY DO THE INTERPOLATION....AVERAGE SOME VALUE WHERE
WELL KID = QUERY FROM ABOVE....


Questions/Comments?

Cheers,

Jeremy

Monday, March 06, 2006

Summary Statistics

One of my most used web pages is a page that summarizes a numeric column for min, max, standard deviation, count, and produces a 10 class equal binned histogram. One way to do this within Oracle is to execute two queries. The first query gives us the minimum and maximum value to establish how big our binned histogram buckets need to be. The second query counts the number of rows that fall within each bucket.

Take the following interactive map from the Kansas Well Top Stratigraphy Viewer:



The map above is displaying the depth of the Chase formation wells in Kansas relative to sea level. The statistical summary of that dataset can be viewed by selecting the "View Current Viewing Stats" link.



The first query gives us the min, max, standard deviation, and count.

SELECT avg(WELL_TOP_SEALEVEL) average,
min(WELL_TOP_SEALEVEL) minimum,
max(WELL_TOP_SEALEVEL) maximum,
stddev(WELL_TOP_SEALEVEL) standard_dev,
count(WELL_TOP_SEALEVEL) count
FROM PLSS.strat_well_top_1MAR2006
WHERE FORMATION_NAME = 'Chase Group'
and WELL_TOP is not null
and WELL_TOP <> 9999
and GROUND_ELEVATION > 0
and LONGITUDE >= -102.25798
and LONGITUDE <= -94.201246 and LATITUDE >= 34.4665145
and LATITUDE <= 42.5232570


All we are really doing is letting the database summarize (avg, min, max, etc.) a numeric column (WELL_TOP_SEALEVEL) based on the where criteria (the formation is Chase Group, the well value exists, the well ground elevation is above 0, and the well is within the current viewing extent).

Once we know the min and max value we can produce a query that counts the number of rows that fall within an equal 10 class range.


SELECT avg(WELL_TOP_SEALEVEL) average,
stddev(WELL_TOP_SEALEVEL) stddevvalue, '1' C0
,sum(case
when WELL_TOP_SEALEVEL between -643 AND -410.7
then 1
else 0
end) C1
,sum(case
when WELL_TOP_SEALEVEL between -410.7 AND -178.4
then 1
else 0
end) C2
,sum(case
when WELL_TOP_SEALEVEL between -178.4 AND 53.9
then 1
else 0
end) C3
,sum(case
when WELL_TOP_SEALEVEL between 53.9 AND 286.2
then 1
else 0
end) C4
,sum(case
when WELL_TOP_SEALEVEL between 286.2 AND 518.5
then 1
else 0
end) C5
,sum(case
when WELL_TOP_SEALEVEL between 518.5 AND 750.8
then 1
else 0
end) C6
,sum(case
when WELL_TOP_SEALEVEL between 750.8 AND 983.1
then 1
else 0
end) C7
,sum(case
when WELL_TOP_SEALEVEL between 983.1 AND 1215.4
then 1
else 0
end) C8
,sum(case
when WELL_TOP_SEALEVEL between 1215.4 AND 1447.7
then 1
else 0
end) C9
,sum(case
when WELL_TOP_SEALEVEL between 1447.7 AND 1680
then 1
else 0
end) C10
FROM PLSS.strat_well_top_1MAR2006
WHERE FORMATION_NAME = 'Chase Group'
AND WELL_TOP is not null
AND WELL_TOP <> 9999


Here we are using the case function on a row by row basis.

sum(case
when WELL_TOP_SEALEVEL between 1447.7 AND 1680
then 1
else 0
end)

If the value is between x and y then give it a value of 1, otherwise give it a value of 0. When we sum up that count for all rows we end up with a number of rows where the value is within the pre-defined bin.

Cheers,

Jeremy

Monday, February 06, 2006

The Power of the Spatial to Attribute Join-Part 3

Sorry for the long delay between posts!

Today we are going to look at multi-dimensional subsurface mapping. First a little background. The Kansas Geological Survey has an extensive database of subsurface tops. A geologic top is the depth in feet from the surface to a particular geologic unit (system, group, formation, member). This data is collected from numerous scientists working over many years. When a well is drilled (mostly for petroleum) geophysical logs are run in the hole to measure different properties of the rocks. These electric logs are scanned, studied, and geologic units are picked from the logs (Chase group, Permian system, Fort Scott limestone formation). Read this primer for more information.

So what does this mean for GIS and SQL? With this database we have access to over 1,700,000 called tops spread across ~150,000 wells in the state of Kansas. Each well can/will have more than one called formation. We have more than one record per geographic location--in this case the records represent mulitple entries (formations) in the z domain.

We can visualize some of this dataset in ArcScene (or any other 3D package).
Most analysis and visualization occurs when users map one unit of measure (depth to unit, thickness between units, or something else entirely) per geographic location.

The query to select an individual unit is pretty easy.

SQL> desc strat_well_top_1JUN2005
Name Null? Type
--------------------------------- -------- ------------------------
FORMATION_NAME NOT NULL VARCHAR2(60)
WELL_HEADER_KID NOT NULL NUMBER(10)
WELL_TOP NUMBER(8,2)
FIELD_KID NUMBER(10)
LATITUDE NUMBER(11,6)
LONGITUDE NUMBER(11,6)
GROUND_ELEVATION VARCHAR2(40)
GROUND_ELEVATION_SOURCE VARCHAR2(12)
WELL_TOP_SEALEVEL_NON_NED VARCHAR2(40)
WELL_TOP_SEALEVEL NUMBER

1 select well_header_kid, latitude, longitude,
2 well_top, well_top_sealevel
3 from strat_well_top_1JUN2005
4* where FORMATION_NAME = 'Permian System'
SQL> /


WELL_HEADER_KID LATITUDE LONGITUDE WELL_TOP WELL_TOP_SEALEVEL
--------------- ---------- ---------- ---------- -----------------
1025687130 37.58751 -101.36115 680 -2381
1028094976 37.61659 -100.73941 834 -2095
1006155619 37.39194 -101.67532 608 -2673
1006514758 39.99071 -97.14744 358 -1158
1025773538 37.20899 -101.90656 1395 -2120
1024492073 37.00136 -100.56991 3550 1170
1026599879 37.91949 -101.48087 660 -2529
1006054521 37.50601 -99.70401 387 -2066
1006054213 37.53783 -99.8804 417 -2098
1006054635 37.48215 -100.02145 412 -2179
1002915548 37.55853 -100.38357 775 -2008


This can be done easily outside of database. With ArcMap you can do a definition query to limit the tabluar dataset to only the unit in question.

With ArcIMS you can perform an ArcXML spatial query to only select the unit in question.
<LAYER type="featureclass" name="Depth" visible="true">
<DATASET fromlayer="1" />
<SPATIALQUERY searchorder="attributefirst"
where="PLSS.WELL_HEADERS_29AUG2003.KID = PLSS.STRAT_WELL_TOP_28AUG2003.WELL_HEADER_KID
and PLSS.STRAT_WELL_TOP_28AUG2003.FORMATION_NAME = 'Permian System'
and PLSS.STRAT_WELL_TOP_28AUG2003.WELL_TOP <> 9999
and PLSS.STRAT_WELL_TOP_28AUG2003.GROUND_ELEVATION > 0"
jointables="PLSS.STRAT_WELL_TOP_28AUG2003" />
<VALUEMAPRENDERER lookupfield="PLSS.STRAT_WELL_TOP_28AUG2003.WELL_TOP">
<RANGE lower="15" upper="326" label="15 TO 326">
<SIMPLEMARKERSYMBOL color="56,168,0" width="6" />
</RANGE>
<RANGE lower="326" upper="637" label="326 TO 637">
<SIMPLEMARKERSYMBOL color="90,186,0" width="6" />
</RANGE>
<RANGE lower="637" upper="949" label="637 TO 949">
<SIMPLEMARKERSYMBOL color="131,207,0" width="6" />
</RANGE>
<RANGE lower="949" upper="1260" label="949 TO 1260">
<SIMPLEMARKERSYMBOL color="176,224,0" width="6" />
</RANGE>
<RANGE lower="1260" upper="1571" label="1260 TO 1571">
<SIMPLEMARKERSYMBOL color="228,245,0" width="6" />
</RANGE>
<RANGE lower="1571" upper="1882" label="1571 TO 1882">
<SIMPLEMARKERSYMBOL color="255,225,0" width="6" />
</RANGE>
<RANGE lower="1882" upper="2193" label="1882 TO 2193">
<SIMPLEMARKERSYMBOL color="255,170,0" width="6" />
</RANGE>
<RANGE lower="2193" upper="2504" label="2193 TO 2504">
<SIMPLEMARKERSYMBOL color="255,115,0" width="6" />
</RANGE>
<RANGE lower="2504" upper="2816" label="2504 TO 2816">
<SIMPLEMARKERSYMBOL color="255,55,0" width="6" />
</RANGE>
<RANGE lower="2816" upper="3550" label="2816 TO 3550">
<SIMPLEMARKERSYMBOL color="255,0,0" width="6" />
</RANGE>
</VALUEMAPRENDERER>
</LAYER>

This looks like....


Lets get a little more complicated. Say you want to compute the thickness between two different formations. One way to do this is to select the two formations of interest, group the results, and select the difference between the min depth value and the max depth value.


select WELL_HEADER_KID, round(max(WELL_TOP) - min(WELL_TOP),1) STRAT_THICKNESS
from strat_well_top_1JUN2005
where (formation_name = 'Chase Group'
or formation_name = 'Council Grove Group')
and WELL_TOP is not null
and WELL_TOP <> 9999
and LONGITUDE >= -103.57371
and LONGITUDE <= -98.352965
and LATITUDE >= 34.9908216
and LATITUDE <= 40.2115692
group by WELL_HEADER_KID
having count(distinct formation_name) = 2
and max(WELL_TOP) - min(WELL_TOP) > 0
and max(WELL_TOP) - min(WELL_TOP) <> max(WELL_TOP)


By selecting only the Chase Group and the Council Grove Group we lower our tops table to records that have either the Chase or The Council Grove. Once we group by the well_header_kid and select records that have rows in both groups we can do the math to compute the thickness.
group by WELL_HEADER_KID 
having count(distinct formation_name) = 2

This drops records of wells that have one group or the other, but not both.

Now we can subtract the "min(well_top)" depth from the "max(well_top)" depth to compute the thickness between the two groups for a particluar well.

Slap that query to a view and join that in ArcIMS and you have nice point map of the thickness between two geologic groups.


Give it a try at: http://hercules.kgs.ku.edu/kgs/oilgas/strat_welltops/top_viewer.cfm

Or we can use ArcGIS server to generate a raster dataset off that that view (example application coming soon).

What do you think?

Cheers,

Jeremy

Thursday, November 10, 2005

The Power of the Spatial to Attribute Join-Part 2

This is part two on my series on the mapping and analysis of complex multi-dimensional datasets. Today we will look at historical sea surface temperatures (SST). I am by no means an expert on SST models our on the effect of warming sea surface temperatures on oceanic biological habitat. So take this example with a grain of salt...

IMPORTANCE: The number of consecutive records above the max baseline can adversely effect the benthic habitat of the oceanic environment. For example, El Nino can cause coral reef bleaching. This bleaching occurs because sea surface temperatures are continuously above the historical upper limit for the species.

GOAL: For every unique ID_ONEDG (unique spatial location), select the maximum number of CONSECUTIVE months that are above the long term average maximum yearly value.

EXAMPLE DATASET: HADL_SST
TABLE COMMENTS: Sea Surface Temperatures: OI version 2 SST Analysis from NCEP - Reynolds, Smith, and Stokes. One degree resolution. Monthly from 1871 to 1999. Extent: 40N Latitude to 40S Latitude No land. Modeled data.


SQL> desc hadl_sst
Name Null? Type
----------------------------------------- -------- ------------
ID_ONEDG NOT NULL NUMBER(6)
DATE_RECORDED NOT NULL DATE
SST_CENTIGRADE NOT NULL NUMBER(6,2)

Total number of records: 33676740


For speed purposes, for this example, we need to compute the max_baseline_value.

create table HADL_SST_AVG_YR_MAX_MNTH as (
select ID_ONEDG, avg(yearly_max) MAX_BASELINE_VALUE
from (select ID_ONEDG, max(SST_CENTIGRADE) yearly_max
from hadl_sst
group by ID_ONEDG, to_char(DATE_RECORDED, 'YYYY')
)
group by ID_ONEDG
);

SQL> desc HADL_SST_AVG_YR_MAX_MNTH
Name Null? Type
----------------------------------------- -------- ---------------------
ID_ONEDG NOT NULL NUMBER(6)
MAX_BASELINE_VALUE NUMBER

Total number of records: 21755


Spatial attribute table that contains spatial information about the center of the one degree cell (latitude and longitude). This is a table that is registered with ESRI's spatial database engine (SDE). This can be visualized with traditional GIS software.


SQL> desc one_degree
Name Null? Type
----------------------------------------- -------- --------------------
OBJECTID NOT NULL NUMBER(38)
LONG_1 NUMBER(6,1)
LAT_1 NUMBER(6,1)
ID_ONEDG NUMBER(6)
SHAPE NUMBER(38)

Total number of records: 64800


The objectives of my approach to this query are:
-Select all records that have a monthly temperature greater than the baseline over a predefined three year period centered on an elnino year (or la nina).
-Group the selected rows for each ID_ONEDG where the previous record is within a month AND has a temperature value greater than the baseline.

This is where we take advantage of the LAG/LEAD Analytical function provided by Oracle.


select A.ID_ONEDG ID_ONEDG,
a.date_recorded date_recorded,
case
when date_recorded -
lag(a.date_recorded)
over (order by a.ID_ONEDG, a.date_recorded)
not between 0 and 35
then rownum
end rn,
a.sst_centigrade sst_centigrade,
b.MAX_BASELINE_VALUE MAX_BASELINE_VALUE
from hadl_sst a, HADL_SST_AVG_YR_MAX_MNTH B, ONE_DEGREE C
where C.LAT_1 between -40 and 40
and B.ID_ONEDG = C.ID_ONEDG
and A.ID_ONEDG = B.ID_ONEDG
and A.date_recorded between
to_date('01-jan-1957', 'dd-mon-yyyy')
and to_date('31-dec-1959', 'dd-mon-yyyy')
and a.sst_centigrade > b.MAX_BASELINE_VALUE
order by a.ID_ONEDG, a.date_recorded


ID_ONEDG DATE_RECO RN SST_CENTIGRADE MAX_BASELINE_VALUE
---------- --------- ---------- -------------- ------------------
18041 01-AUG-98 35 20.47 20.1232558
18042 01-AUG-98 20.66 19.9818605
18043 01-AUG-98 20.7 19.8624806
18044 01-AUG-98 20.56 19.6863566
18044 01-SEP-98 19.87 19.6863566
18045 01-AUG-97 40 19.63 19.465969
18045 01-SEP-97 19.58 19.465969
18045 01-AUG-98 42 20.26 19.465969
18045 01-SEP-98 19.91 19.465969
18046 01-AUG-97 44 19.72 19.2837209
18046 01-SEP-97 19.65 19.2837209
18046 01-AUG-98 46 20.05 19.2837209
18046 01-SEP-98 19.87 19.2837209


Notice that ID_ONEDG of 18045 has two groups of consecutive months where the temperature is greater than the baseline value (Aug-97 to Sep-97 & Aug-98 to Sep-98).


-Once the groups have been recognized we need to carry the group id forward.

select ID_ONEDG,
nvl(max(rn)
over (order by ID_ONEDG,date_recorded),
-1) max_rn
from ( QUERY-FROM-ABOVE )
order by 1


ID_ONEDG MAX_RN
---------- ----------
18041 35
18042 35
18043 35
18044 35
18044 35
18045 40
18045 40
18045 42
18045 42
18046 44
18046 44
18046 46
18046 46


Notice that for ID_ONEDG of 18045 we have two groups each with two values (40 and 42).

-Now we group id_onedg and the max_rn to compute the count of records that have similar values for id_onedg and max_rn.


select ID_ONEDG, count(max_rn) consec_months
from ( QUERY-FROM-ABOVE )
group by ID_ONEDG, max_rn

ID_ONEDG CONSEC_MONTHS
---------- -------------
18041 1
18042 1
18043 1
18044 2
18045 2
18045 2
18046 2
18046 2



-Now that we have the count for all consecutive months above the baseline for each id_onedg we need to select the maximum number of consecutive months.


select max(consec_months) max_consecutive, ID_ONEDG
from ( QUERY-FROM-ABOVE )
group by ID_ONEDG

MAX_CONSECUTIVE ID_ONEDG
--------------- ----------
1 18041
1 18042
1 18043
2 18044
2 18045
2 18046



WRAP-UP: Analytical functions can be very powerful. The only way to do the previous query without analytical functions would be to build a PL/SQL procedure (email me if you want to see that version) that does the max consecutive computation for each spatial location.


select max(consec_months) max_consecutive, ID_ONEDG
from
(
select ID_ONEDG, count(max_rn) consec_months
from
(
select ID_ONEDG, nvl(max(rn)
over(order by ID_ONEDG,date_recorded), -1) max_rn
from
(
select A.ID_ONEDG ID_ONEDG,
a.date_recorded date_recorded,
case
when date_recorded - lag(a.date_recorded) over (order by a.ID_ONEDG, a.date_recorded) not between 0 and 35
then rownum
end rn,
a.sst_centigrade sst_centigrade,
b.MAX_BASELINE_VALUE MAX_BASELINE_VALUE
from hadl_sst a, HADL_SST_AVG_YR_MAX_MNTH B, ONE_DEGREE C
where C.LAT_1 between -40 and 40
and B.ID_ONEDG = C.ID_ONEDG
and A.ID_ONEDG = B.ID_ONEDG
and A.date_recorded between to_date('01-jan-1997', 'dd-mon-yyyy') and to_date('31-dec-1999', 'dd-mon-yyyy')
and a.sst_centigrade > b.MAX_BASELINE_VALUE
order by a.ID_ONEDG, a.date_recorded
)
order by 1
)
group by ID_ONEDG, max_rn
)
group by ID_ONEDG

21516 rows selected.

Elapsed: 00:00:24.06

So what does it look like?

Conesecutive months above the baseline maximum sea surface temperature for the three year period of 1979 to 1981 (1980 was considered a normal year) and the three year period of 1982 to 1984 (1984 was considered a El Nino period).


Conesecutive months above the baseline maximum sea surface temperature for the three year period of 1988 to 1990 (1989 was considered a La Nina year) and the three year period of 1996 to 1998 (1997 was considered a strong El Nino period).

Questions? Comments?

Cheers,

Jeremy

This page is powered by Blogger. Isn't yours?